postgresql – How to replace multiple special characters in Postgres 9.5

postgresql – How to replace multiple special characters in Postgres 9.5

replace()

If you want just to replace one or few characters you can use function replace(string text, from text, to text) that replaces all occurrences in string substring. The replace function can be used to replace one character to several characters.

translate()

If you want to translate some letters to other letters you can user function translate(string text, from text, to text) that replaces any character in a string that matches a character in the from by the corresponding character in the to set.

Some data to play with:

drop table if exists xyz;

create table xyz (
    id serial not null,
    name varchar(30)
);

insert into xyz (name) values
    (Juhänäo),
    (Jürgüen),
    (Dannäu),
    (Übüdyr);

Example of replace function:

select replace(name, ä, a) from xyz;

This function replaces letter ä in the name column with letter a. Juhänäo becomes Juhanao.

select replace(name, ä, ae) from xyz;

Now it replaces letter ä with ae.

select replace(replace(replace(name, ä, ae), ü, ue), Ü, Ue) from xyz;

Not very nice, but in the example all ä become ae, ü become ue, and Ü become Ue.

update xyz set name = replace(replace(replace(name, ä, ae), ü, ue), Ü, Ue);

Changes letters and updates rows. The result of the update is following:

Juhaenaeo
Juergueen
Dannaeu
Uebuedyr

Example of translate function:

select translate(name, ä,ü,Ü, a,u,U) from xyz;

Translates all letters ä to a, ü to u and Ü to U.

update xyz set name = translate(name, ä,ü,Ü, a,u,U);

Updates table so all predefined letters are translated and the change is saved to the database. The result of the update is following:

Juhanao
Jurguen
Dannau
Ubudyr

More information:

Replace characters with multi-character strings

Postgresql string functions

No, there are no this function. Probably is not to hard to write optimized C extension what does it. But C language is not necessary always. You can try SQL or PLpgSQL function:

CREATE OR REPLACE FUNCTION xx(text, text[], text[])
RETURNS text AS $$
   SELECT string_agg(coalesce($3[array_position($2, c)],c),)
      FROM regexp_split_to_table($1,) g(c)
$$ LANGUAGE sql;

postgres=# select xx(Jürgen, ARRAY[ä,ü], ARRAY[ae,ue]);
┌─────────┐
│   xx    │
╞═════════╡
│ Juergen │
└─────────┘
(1 row)

On my comp it does 6000 transformation under 200ms (but I have developer build of PostgreSQL – it is slower).

postgresql – How to replace multiple special characters in Postgres 9.5

If you are after German letters, then this works:

CREATE OR REPLACE FUNCTION public.udf_transliterate_german(
    german_word character varying)
    RETURNS character varying
    LANGUAGE sql
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
SELECT REPLACE(
    REPLACE(
        REPLACE(
            REPLACE(
                REPLACE(
                    REPLACE(
                        REPLACE(
                            REPLACE(german_word,
                                ä,ae),
                            ö,oe ),
                        ü,ue),
                    ß,ss),
                Ä, AE),
            Ö, OE),
        Ü, UE),
    ẞ, SS);
$BODY$;

It is not elegant though.

Leave a Reply

Your email address will not be published. Required fields are marked *