WinDev Regular expressions: From A to Z
Would you believe it? In all these years I never had to use regular expressions for anything other than retrieving files (i.e. using * and ?)... Suddenly, I have to incorporate content coming from web pages into a database, and WinDev's "MatchRegularExpression" function seems very handy for that... Except that the corresponding help page is... let's say... lighter than I would have liked. As usual, the WinDev help on a subject presupposes that you already know a lot about the subject in general (i.e. outside WinDev). If that's not the case, you are expected to refer to a book on the thing, whatever the thing is. My purpose here is therefore to give you that basic background and to group the information I found in different places, in different help pages, and even just by trying things.

So first, what are regular expressions? Basically, it's a technique that allows you to easily check the content of a string against a predefined format (like the input/display formats used in WinDev fields). The difference is that you can express much more with a regular expression than with a 9999.99 format.

The second obvious question is: where should or could I use regular expressions in WinDev? Two answers: For example, you can 'easily' create a mask that lets the user enter between 1 and 4 uppercase letters, then 1 digit, then 1 digit or the letter X, then 4 letters. The same expression also lets you verify that a pasted or imported string has the correct format for a field.

Here's an extract of the WinDev help page about regular expression syntax, which I translated for you:
Now, if you read the page a little further down, you will find out that, for the case where you want to extract parts of the string into several variables, you also have:

( ) Limits of one part of the format you want to extract

And that's the first main error in the help page: these two elements are available whether you extract parts of the string or not (i.e. also when you check a string with MatchRegularExpression or in a field input format).

So let's see now what I tried and found to be working, whether it was written in the help file or not. I'm also going to list things that are not working, or that you should be careful about:

- MatchRegularExpression will generate an exception for some expression strings! For example: ([A-z0-9]+)(
- A regular expression is built by describing one group of characters after another.
- The expressions are case sensitive: [A-Z], [a-z], and [A-z] are not the same at all.
- Clearly the expressions are ASCII dependent, and while the first 128 characters of the ASCII table are standard, the other 128 are language dependent... which means that, depending on the languages you are managing, your regular expressions can be different. A good idea is to store them in your program as international strings. A good ASCII table is clearly necessary when working in that domain, so here's one online!
- You can have several groups of characters inside one block: [A-Z] or [0-9] is valid, but so is, for example, [A-Z0-9a-z], which means anything between A and Z, anything between 0 and 9, and anything between a and z.
- You can add characters that are not in an interval. For example, the following syntax means anything between A and Z, plus spaces and exclamation points: [A-Z !]
- You can also include special characters using the usual WinDev syntax: "[A-Z"+CR+"]" is valid and means that your string can contain any character between A and Z or a carriage return.
- Contrary to what is said in the help of inputmask, the same regular expression syntax is valid for all 3 cases: MatchRegularExpression with or without extraction, and ..Inputmask. If you use things that are not relevant (for example ( and ) in an input mask), they are just ignored.

And by the way: [A-Za-z]{0,1}[0-9]{0,1} and [A-Za-z][0-9] are NOT identical: the first one means 0 or 1 letter, uppercase or lowercase, and 0 or 1 digit... The second means 1 letter and 1 digit (0 of either of them is forbidden).

(new--) Thanks to Eric L., who answered me on the PCSoft WinDev forum, there's more to say about that point. You CAN technically use [A-Za-z][0-9] in an input mask, but you won't be able to type a value in the field, as it requires at ALL TIMES one letter and one digit... You will only be able to paste a string into the field, and only if it's valid.
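To make the difference between these two expressions concrete, here is a minimal sketch in Python rather than WLanguage, simply because Python's re module is easy to run anywhere. It is only an illustration under that assumption: WinDev's regular expression engine may behave differently in places, and the names matches and typing_steps are made up for this example. The sketch checks whole strings against each expression, then checks every intermediate state a user would go through while typing "X4".

```python
import re

# The two expressions compared in the text.
OPTIONAL = r"[A-Za-z]{0,1}[0-9]{0,1}"  # 0 or 1 letter, then 0 or 1 digit
REQUIRED = r"[A-Za-z][0-9]"            # exactly 1 letter, then exactly 1 digit


def matches(pattern: str, text: str) -> bool:
    """Return True when the WHOLE string fits the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


def typing_steps(pattern: str, final_text: str) -> list:
    """Check each prefix a user would type on the way to final_text."""
    return [
        (final_text[:i], matches(pattern, final_text[:i]))
        for i in range(1, len(final_text) + 1)
    ]


if __name__ == "__main__":
    # Whole-string checks: the optional form accepts partial values,
    # the required form accepts only one letter followed by one digit.
    for sample in ["X", "4", "X4"]:
        print(repr(sample), matches(OPTIONAL, sample), matches(REQUIRED, sample))

    # Character-by-character typing of "X4" against each expression.
    print(typing_steps(REQUIRED, "X4"))
    print(typing_steps(OPTIONAL, "X4"))

    # Case sensitivity and ASCII ranges: in ASCII, [A-z] also covers the
    # punctuation that sits between 'Z' and 'a', such as '_'.
    print(matches(r"[A-z]", "_"), matches(r"[A-Za-z]", "_"))

    # A block mixing an interval with extra characters (space and '!').
    print(matches(r"[A-Z !]+", "HELLO WORLD!"))
```

In this sketch, the intermediate state "X" is rejected by [A-Za-z][0-9] and accepted by [A-Za-z]{0,1}[0-9]{0,1}, which is consistent with what Eric L. describes for a field that validates its content at every keystroke.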
Therefore there is no way (it seems) to use a regular expression for an input mask of 1 letter AND 1 number. If you use [A-Za-z][0-9], you cannot enter the letter alone (before the number), and if you use [A-Za-z]{0,1}[0-9]{0,1}, it becomes possible NOT to enter one of them. In order to be able to test this point more easily, I improved my testing utility by adding a field using ..inputmask with the current expression. (--new)

Thanks for your remark... I had not considered things from that angle... And you are right! It therefore implies the following: there is no exact equivalent of [A-Za-z][0-9] for an input mask... Indeed, if you use [A-Za-z]{0,1}[0-9]{0,1} for the mask, it is possible to enter just 'X' or just '4', but nothing in this mask FORCES you to enter one of each... whereas [A-Za-z][0-9] specifies one letter AND one digit. The moral of the story is that the input mask has to be treated as a different case, and that each step of the typing has to be taken into account to get a valid mask... To make testing easier, I added an edit field using ..masquesaisie to my test utility... I'm going to publish it right away and update my page according to your remarks ---> eric l.

- The syntax to use for ..Inputmask is MyField..inputmask="regexp:"+ RegularExpression

After all that, I'm still not a regular expression guru, and frankly, I don't want to become one :-) So I extended my test window to allow the following things:

You can download it by clicking on one of those links:
Links: