C# Regex by example

This translation of the article was prepared specifically for students of the "C# Developer" course.




The Regex class implements regular expressions in C#. It offers methods and properties for searching large texts for character patterns. In this article, you will learn how to use the .NET Regex class in C#, including how to validate various kinds of user input.

Regular expressions


A regular expression (also called a regex or regexp) is a sequence of characters that defines a pattern. It is used to check whether a string matches that pattern. A pattern can consist of literals, numbers, characters, operators, or constructs, and is used to find matches in a string or file.
Regular expressions are often used for validating input, parsing, and searching strings. Typical examples include checking a date of birth, a social security number, or a full name in which the first and last names are separated by a comma; counting the occurrences of a substring; replacing substrings; and validating date, email, and currency formats.

The Regex class


In .NET, the Regex class represents the regular expression engine. It can be used to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete substrings; and to add the extracted strings to a collection in order to generate a report.

The Regex class is defined in the System.Text.RegularExpressions namespace. Its constructor takes a pattern string as a parameter, along with other optional parameters.

The following code snippet creates a regular expression from a pattern. Here, the pattern matches a word starting with the letter "M".

    // Pattern that matches a word starting with the letter "M"
    string pattern = @"\b[M]\w+";
    // Create a Regex object
    Regex rg = new Regex(pattern);


The following code fragment contains a string with the authors' names that needs to be analyzed.

    // Input text
    string authors = "Mahesh Chand, Raj Kumar, Mike Gold, Allen O'Neill, Marshal Troll";


The Matches method finds all matches of the regular expression in the input string and returns them as a MatchCollection.

    // Get all matches
    MatchCollection matchedAuthors = rg.Matches(authors);


The next piece of code iterates over the collection of matches.

    // Print every match
    for (int count = 0; count < matchedAuthors.Count; count++)
        Console.WriteLine(matchedAuthors[count].Value);


Here is the complete code:

    // Pattern that matches a word starting with the letter "M"
    string pattern = @"\b[M]\w+";
    // Create a Regex object
    Regex rg = new Regex(pattern);
    // Input text
    string authors = "Mahesh Chand, Raj Kumar, Mike Gold, Allen O'Neill, Marshal Troll";
    // Get all matches
    MatchCollection matchedAuthors = rg.Matches(authors);
    // Print every match
    for (int count = 0; count < matchedAuthors.Count; count++)
        Console.WriteLine(matchedAuthors[count].Value);


In the example above, the pattern looks for words that begin with an uppercase "M". But what if a word begins with a lowercase "m"? The following code snippet uses the RegexOptions.IgnoreCase option so that the match is not case sensitive.

    // Pattern that matches a word starting with the letter "m"
    string pattern = @"\b[m]\w+";
    // Create a case-insensitive Regex object
    Regex rg = new Regex(pattern, RegexOptions.IgnoreCase);
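To see the effect of RegexOptions.IgnoreCase end to end, here is a minimal sketch. The class and method names (IgnoreCaseDemo, WordsStartingWithM) are made up for illustration, and the sample string is a variant of the authors list in which "mike" has been lowercased on purpose.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class IgnoreCaseDemo
{
    // Returns every word in the input that starts with "m" or "M".
    public static List<string> WordsStartingWithM(string input)
    {
        var rg = new Regex(@"\b[m]\w+", RegexOptions.IgnoreCase);
        var words = new List<string>();
        foreach (Match match in rg.Matches(input))
        {
            words.Add(match.Value);
        }
        return words;
    }

    public static void Main()
    {
        // Assumption: "mike" is lowercased here to exercise IgnoreCase.
        string authors = "Mahesh Chand, Raj Kumar, mike Gold, Allen O'Neill, Marshal Troll";
        foreach (string word in WordsStartingWithM(authors))
        {
            Console.WriteLine(word); // Mahesh, mike, Marshal (one per line)
        }
    }
}
```

Without the IgnoreCase option, the same pattern would match only "mike" in this string, because [m] is then compared case sensitively.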


Replacing multiple spaces with Regex


The Regex.Replace() method replaces the matched text with a new string. The following example replaces each run of multiple spaces in a string with a single space.

    // A string with too much whitespace
    string badString = "Here is  a string   with a ton  of white    space.";
    string CleanedString = Regex.Replace(badString, "\\s+", " ");
    Console.WriteLine($"Cleaned String: {CleanedString}");


The following code snippet replaces each run of whitespace with '-'.

    string CleanedString = Regex.Replace(badString, "\\s+", "-");
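Both replacements above can be wrapped in one small helper. The following is a sketch only; WhitespaceDemo and CollapseWhitespace are hypothetical names, and the sample input is invented. Note that \s matches tabs and line breaks as well as spaces.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Replaces every run of whitespace (spaces, tabs, newlines) with the replacement.
    public static string CollapseWhitespace(string input, string replacement)
    {
        return Regex.Replace(input, @"\s+", replacement);
    }

    public static void Main()
    {
        string badString = "Here is  a string   with a ton  of white    space.";
        Console.WriteLine(CollapseWhitespace(badString, " "));
        // Here is a string with a ton of white space.
        Console.WriteLine(CollapseWhitespace(badString, "-"));
        // Here-is-a-string-with-a-ton-of-white-space.
    }
}
```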


Splitting a string on characters using Regex


In the following example, the regular expression pattern [a-z]+ and the Regex.Split() method are used to split a string on runs of letters, ignoring case.

    // Split on runs of letters
    string azpattern = "[a-z]+";
    string str = "Asd2323b0900c1234Def5678Ghi9012Jklm";
    string[] result = Regex.Split(str, azpattern, RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(500));
    for (int i = 0; i < result.Length; i++)
    {
        Console.Write("'{0}'", result[i]);
        if (i < result.Length - 1)
            Console.Write(", ");
    }
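Because the sample string starts and ends with letters, Regex.Split returns an empty string as the first and last element of the array. If those entries are not wanted, one option is to filter them out, as in the sketch below. SplitOnLettersDemo and SplitOnLetters are illustrative names, not part of the original example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitOnLettersDemo
{
    // Splits the input on runs of letters and drops the empty entries that
    // Regex.Split produces when the input starts or ends with a delimiter.
    public static string[] SplitOnLetters(string input)
    {
        string[] parts = Regex.Split(input, "[a-z]+", RegexOptions.IgnoreCase,
                                     TimeSpan.FromMilliseconds(500));
        return parts.Where(part => part.Length > 0).ToArray();
    }

    public static void Main()
    {
        string str = "Asd2323b0900c1234Def5678Ghi9012Jklm";
        Console.WriteLine(string.Join(", ", SplitOnLetters(str)));
        // 2323, 0900, 1234, 5678, 9012
    }
}
```

The TimeSpan argument is a match timeout: if matching takes longer than that, a RegexMatchTimeoutException is thrown instead of the call hanging.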


Regular expressions in C#


Regular expressions are a standard pattern-matching tool for parsing and modifying strings. They let you express how a program should look for a specified pattern in text and what it should do with each match it finds. The term is often abbreviated as "regex". They are a powerful way to find and modify strings that follow a specific format.

Here is a simple C# code example that shows how regular expressions are used.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;

    namespace RegularExpression1
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create a Regex object for the pattern.
                Regex r = new Regex(@"^\+?\d{0,2}\-?\d{4,5}\-?\d{5,6}");
                // Input strings to check against the pattern.
                string[] str = { "+91-9678967101", "9678967101", "+91-9678-967101", "+91-96789-67101", "+919678967101" };
                foreach (string s in str)
                {
                    // IsMatch returns true if the string matches the pattern, otherwise false.
                    Console.WriteLine("{0} {1} a valid mobile number.", s, r.IsMatch(s) ? "is" : "is not");
                }
            }
        }
    }
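One detail worth noticing: the pattern above is anchored at the start with ^ but has no $ at the end, so IsMatch also accepts input that has extra characters after a valid number. The sketch below compares the original pattern with a variant that adds $. The variant, the class name MobileNumberDemo, and the test input with a trailing "abc" are assumptions added for illustration; they are not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MobileNumberDemo
{
    // The pattern from the example above (no end anchor).
    private static readonly Regex Original =
        new Regex(@"^\+?\d{0,2}\-?\d{4,5}\-?\d{5,6}");

    // Hypothetical variant: the same pattern with "$" appended.
    private static readonly Regex Anchored =
        new Regex(@"^\+?\d{0,2}\-?\d{4,5}\-?\d{5,6}$");

    public static bool MatchesOriginal(string s) => Original.IsMatch(s);

    public static bool MatchesAnchored(string s) => Anchored.IsMatch(s);

    public static void Main()
    {
        string[] inputs = { "+91-9678967101", "+91-9678967101abc" };
        foreach (string s in inputs)
        {
            Console.WriteLine("{0}: original={1}, anchored={2}",
                s, MatchesOriginal(s), MatchesAnchored(s));
        }
        // +91-9678967101: original=True, anchored=True
        // +91-9678967101abc: original=True, anchored=False
    }
}
```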


Here is a detailed explanation of regular expressions and their use in C# and .NET: Regular Expressions in C#

Regex for checking email


To validate multiple email addresses, we can use the following regular expressions. This one expects the addresses to be separated by the delimiter ';':

^((\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*)\s*[;]{0,1}\s*)+$

If you want to use the delimiter ',', use the following:

^((\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*)\s*[,]{0,1}\s*)+$

And if you want to allow both delimiters, ',' and ';', use this:

^((\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*)\s*[;,]{0,1}\s*)+$

Thus, using the regular expressions above, you can validate either a single address or several addresses at once.
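As a rough illustration of how such a pattern might be used from code, the sketch below wraps the ';' variant in a helper. EmailListDemo and IsValidEmailList are hypothetical names, the sample addresses are invented, and the one-second match timeout is an added precaution rather than something the original pattern requires.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailListDemo
{
    // The ';'-delimited pattern from the text above.
    private const string Pattern =
        @"^((\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*)\s*[;]{0,1}\s*)+$";

    // Returns true if the input is one address or several ';'-separated addresses.
    public static bool IsValidEmailList(string input)
    {
        // Assumption: a 1-second timeout guards against slow matching on hostile input.
        return Regex.IsMatch(input, Pattern, RegexOptions.None, TimeSpan.FromSeconds(1));
    }

    public static void Main()
    {
        Console.WriteLine(IsValidEmailList("john@example.com"));                   // True
        Console.WriteLine(IsValidEmailList("john@example.com; jane@example.org")); // True
        Console.WriteLine(IsValidEmailList("not-an-email"));                       // False
    }
}
```

Keep in mind that the delimiter is optional in this pattern ({0,1}), so it is fairly permissive about what separates two addresses.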

Find out more here: Regex for checking multiple email addresses.

Validating user input with regular expressions


This part explains how to use regular expressions (the Regex class of the System.Text.RegularExpressions namespace) in C# and .NET.

We can use the Regex.Match method, which takes an input string and a regular expression and returns a Match object whose Success property is true if the input matches the pattern.

    if (!Regex.Match(firstNameTextBox.Text, "^[A-Z][a-zA-Z]*$").Success)
    {
        // The first name is invalid.
    }
    if (!Regex.Match(addressTextBox.Text, @"^[0-9]+\s+([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$").Success)
    {
        // The address is invalid.
    }
    if (!Regex.Match(cityTextBox.Text, @"^([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$").Success)
    {
        // The city is invalid.
    }
    if (!Regex.Match(stateTextBox.Text, @"^([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$").Success)
    {
        // The state is invalid.
    }
    if (!Regex.Match(zipCodeTextBox.Text, @"^\d{5}$").Success)
    {
        // The zip code is invalid.
    }
    if (!Regex.Match(phoneTextBox.Text, @"^[1-9]\d{2}-[1-9]\d{2}-\d{4}$").Success)
    {
        // The phone number is invalid.
    }
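The checks above depend on text box controls, so they cannot run on their own. Below is a minimal console sketch that applies three of the same patterns to plain strings. The class InputValidationDemo, its method names, and the sample values are assumptions made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InputValidationDemo
{
    // A capital letter followed by zero or more letters.
    public static bool IsValidFirstName(string name) =>
        Regex.Match(name, "^[A-Z][a-zA-Z]*$").Success;

    // Exactly five digits.
    public static bool IsValidZipCode(string zip) =>
        Regex.Match(zip, @"^\d{5}$").Success;

    // Format ddd-ddd-dddd, where the first and second groups do not start with 0.
    public static bool IsValidPhone(string phone) =>
        Regex.Match(phone, @"^[1-9]\d{2}-[1-9]\d{2}-\d{4}$").Success;

    public static void Main()
    {
        Console.WriteLine(IsValidFirstName("John"));      // True
        Console.WriteLine(IsValidFirstName("john"));      // False
        Console.WriteLine(IsValidZipCode("12345"));       // True
        Console.WriteLine(IsValidZipCode("1234"));        // False
        Console.WriteLine(IsValidPhone("555-123-4567"));  // True
        Console.WriteLine(IsValidPhone("055-123-4567"));  // False
    }
}
```

When only a yes/no answer is needed, Regex.IsMatch(input, pattern) gives the same result as Regex.Match(input, pattern).Success.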


Find out more here: Validating user input with regular expressions


Splitting a string using Regex.Split in C#


In this part, we will learn how to split a string using Regex in C#. Regex splits a string based on a pattern: the delimiter is specified as a regular expression rather than as a fixed character or string, which makes Regex.Split more flexible than string.Split. Let's look at an example.

To use Regex to split a string, add the following namespaces.

    using System;
    using System.Text.RegularExpressions;
    using System.Collections.Generic;


Example 1:

Extract the numbers from a string using Regex.

    string Text = "1 One, 2 Two, 3 Three is good.";
    // Split on every run of non-digit characters
    string[] digits = Regex.Split(Text, @"\D+");
    foreach (string value in digits)
    {
        int number;
        // Skip the empty entries and print only the numbers
        if (int.TryParse(value, out number))
        {
            Console.WriteLine(value);
        }
    }


The code above splits the string using \D+ (one or more non-digit characters) and prints the numbers by iterating over the result.
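The same numbers can also be collected by matching the digits directly with Regex.Matches and \d+ instead of splitting on the non-digits. This is a different technique from the Regex.Split example above, shown here only for comparison; ExtractNumbersDemo and ExtractNumbers are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ExtractNumbersDemo
{
    // Finds every run of digits in the text and parses it as an int.
    public static List<int> ExtractNumbers(string text)
    {
        var numbers = new List<int>();
        foreach (Match match in Regex.Matches(text, @"\d+"))
        {
            // TryParse skips anything that does not fit in an int.
            if (int.TryParse(match.Value, out int number))
            {
                numbers.Add(number);
            }
        }
        return numbers;
    }

    public static void Main()
    {
        string text = "1 One, 2 Two, 3 Three is good.";
        Console.WriteLine(string.Join(", ", ExtractNumbers(text))); // 1, 2, 3
    }
}
```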

Find out more here: Split string using regular expressions in C#

Replacing special characters in a string with Regex


In this part, I will show you how to replace special characters using regular expressions in C#.

If you have a string with special characters and want to remove or replace them, you can use a regular expression for this.

Use the following code:

    Regex.Replace(yourString, @"[^0-9a-zA-Z]+", "")


This code deletes all special characters. If you want to keep some of them, for example the comma "," and the colon ":", make the following change:

    Regex.Replace(yourString, @"[^0-9a-zA-Z:,]+", "")


In the same way, you can make changes according to your requirements.
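Here is a short runnable sketch of both variants. The class name, the method names, and the sample string are invented for the example. Note that spaces are not in either character class, so they are removed as well.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecialCharactersDemo
{
    // Removes everything except ASCII letters and digits.
    public static string StripSpecial(string input) =>
        Regex.Replace(input, @"[^0-9a-zA-Z]+", "");

    // Same as above, but keeps commas and colons.
    public static string StripSpecialKeepCommaColon(string input) =>
        Regex.Replace(input, @"[^0-9a-zA-Z:,]+", "");

    public static void Main()
    {
        string input = "Hello, World! Time: 10:30 #C#";
        Console.WriteLine(StripSpecial(input));               // HelloWorldTime1030C
        Console.WriteLine(StripSpecialKeepCommaColon(input)); // Hello,WorldTime:10:30C
    }
}
```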

Note:

Regular expressions are not a panacea for every small string manipulation. If the simple parsing you need is already provided by the String class or other classes, prefer those.
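For instance, splitting a comma-separated list of names needs no regular expression at all. The sketch below uses only string.Split and Trim; PlainStringDemo and SplitNames are hypothetical names chosen for the example.

```csharp
using System;

public static class PlainStringDemo
{
    // Splits a comma-separated list and trims the whitespace around each item.
    public static string[] SplitNames(string csv)
    {
        string[] parts = csv.Split(',');
        for (int i = 0; i < parts.Length; i++)
        {
            parts[i] = parts[i].Trim();
        }
        return parts;
    }

    public static void Main()
    {
        string authors = "Mahesh Chand, Raj Kumar, Mike Gold";
        foreach (string name in SplitNames(authors))
        {
            Console.WriteLine(name);
        }
    }
}
```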


For further reading



If you are new to regular expressions, I recommend reading this article: "Introduction to Regular Expressions".

Here's another article: "Using Regular Expressions in C#"

Source: https://habr.com/ru/post/469989/

