How To Reconstruct String Sections From Concatenated String Format

Posted by Ahmed Tarek Hasan on 5/24/2013 04:42:00 PM

Sometimes you need to pass values from one module to another but are restricted to a string rather than a full object. The restriction may exist for performance reasons or because of a technical constraint, such as a dictionary object that only gives you "key" and "value" pairs.

There is more than one way to deal with this situation, serialization for example. I think the simplest is string concatenation, but keep in mind that this approach requires every value you want to pass to be representable as a string.

Every object/value can be represented as a string, but a string is not always the best choice. If the object is complicated, reconstructing it from its string format will be hard; not impossible, but hard.

So, assume that we decided to use the string concatenation approach. It works as follows:
  1. Constructing the string format: concatenate all sections/values into one string, using a certain character or set of characters as a separator between every two sections/values
  2. Reconstructing sections/values: split the constructed string by the same character or set of characters you used when constructing it

The problem is that whatever character or set of characters we choose as the separator, there is always a chance that the sections/values themselves contain it. When that happens, splitting the constructed string returns deformed sections/values.
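
To make the problem concrete, here is a minimal sketch of the plain approach with no extra steps. The class name NaiveConcatenation and its two methods are hypothetical names used only for this illustration, and "#;" is the separator picked in the analysis that follows.

```csharp
using System;

public static class NaiveConcatenation
{
    // Separator used in the analysis of this post.
    public const string Separator = "#;";

    // Joins the sections with the separator, without escaping anything.
    public static string Join(params string[] sections)
    {
        return string.Join(Separator, sections);
    }

    // Splits on the separator. A section that itself contains the
    // separator comes back broken into several pieces.
    public static string[] Split(string concatenated)
    {
        return concatenated.Split(new[] { Separator }, StringSplitOptions.None);
    }
}
```

Calling NaiveConcatenation.Split(NaiveConcatenation.Join("Ahm#;ed", "Tar#ek")) gives three pieces ("Ahm", "ed" and "Tar#ek") instead of the two sections we started with.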

So, to overcome this problem we will use the same approach but with some extra steps.

Analysis
  • Let's say that we will use "#;" as our separator
  • Also, let's say that the sections/values we want to work on are "Ahm#;ed" and "Tar#ek"
  • If we just concatenate the sections using the separator, the result will be "Ahm#;ed#;Tar#ek"
  • Now, when splitting to reconstruct the values, the result will be "Ahm", "ed" and "Tar#ek"
  • This is wrong
  • Since the "#;" found in "Ahm#;ed" is causing us problems, let's deal with it first
  • We will first replace "#;" in "Ahm#;ed" with something else so that it does not confuse us while splitting
  • But wait, if we replace it with another character or set of characters, won't this cause the same problem we are trying to avoid in the first place?
  • Say that we replace "#;" in "Ahm#;ed" with "<>". This works here because we already know that "Ahm#;ed" doesn't contain "<>". But what if, at run-time, one of the values we are dealing with already contains "<>"? Then restoring "#;" from "<>" after splitting would corrupt that value, so we would be back to getting deformed sections, right?
  • OK, this is confusing, but we have a solution
  • Let's go back a few points: we said that the "#;" found in "Ahm#;ed" is causing us problems
  • Before, we tried to replace "#;" as a whole, although we can replace only a part of it
  • Also, the replacement should not be something completely different
  • Confused? Let's check this example
  • We will replace "#" in "Ahm#;ed" with "#&", so the result will be "Ahm#&;ed"
  • The same goes for "Tar#ek", it will be "Tar#&ek"
  • After concatenation it will be "Ahm#&;ed#;Tar#&ek"
  • After splitting, the sections will be "Ahm#&;ed" and "Tar#&ek", because every "#" inside a section is now followed by "&" and never by ";"
  • Finally, replace "#&" with "#" in all sections
  • Then the sections will be "Ahm#;ed" and "Tar#ek", which are the same sections we started with
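
The difference between the two replacement ideas in the analysis can be sketched side by side. This is only an illustration under the post's choices ("#;" as separator, "<>" as the unrelated token, "#&" as the replacement for "#"); the class and method names are hypothetical.

```csharp
using System;

public static class EscapeComparison
{
    // Idea 1: swap the whole separator for an unrelated token.
    public static string EncodeWithUnrelatedToken(string s)
    {
        return s.Replace("#;", "<>");
    }

    public static string DecodeWithUnrelatedToken(string s)
    {
        return s.Replace("<>", "#;");
    }

    // Idea 2: replace only "#", a part of the separator, with "#&".
    public static string EncodeWithPartialReplace(string s)
    {
        return s.Replace("#", "#&");
    }

    public static string DecodeWithPartialReplace(string s)
    {
        return s.Replace("#&", "#");
    }
}
```

With idea 1, a value such as "Ah<>med" is left unchanged by encoding and then comes back from decoding as "Ah#;med", which is not what we started with. With idea 2, "Ah<>med", "Ahm#;ed" and even "a#&b" all come back unchanged, and an encoded value never contains "#;" because every "#" in it is followed by "&".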

Conclusion
  1. Decide on a separator ("#;")
  2. Decide on the part to replace in each section, given that it is a part of the separator ("#")
  3. Decide on the string to replace it with ("#&")
  4. While concatenating, replace each occurrence of "#" in each section with "#&", then concatenate using "#;"
  5. While splitting, split by "#;", then replace each occurrence of "#&" in each section with "#"
  6. That's it, you now have your sections as they are without any deformation
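
The steps above are not tied to exactly two sections. As a rough sketch, they could be wrapped in a small helper that works on any number of sections. SectionCodec and its members are hypothetical names for this illustration, and treating a null section as an empty string is an assumption of the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SectionCodec
{
    private const string Separator = "#;";    // step 1
    private const string ToReplace = "#";     // step 2
    private const string ReplaceWith = "#&";  // step 3

    // Step 4: escape every section, then concatenate using the separator.
    public static string Join(IEnumerable<string> sections)
    {
        return string.Join(
            Separator,
            sections.Select(s => (s ?? string.Empty).Replace(ToReplace, ReplaceWith)));
    }

    // Step 5: split by the separator, then undo the escaping in each section.
    public static List<string> Split(string concatenated)
    {
        return concatenated
            .Split(new[] { Separator }, StringSplitOptions.None)
            .Select(s => s.Replace(ReplaceWith, ToReplace))
            .ToList();
    }
}
```

One thing to keep in mind with this sketch: an empty list and a list holding one empty section both produce an empty string, so Split cannot tell them apart and always returns at least one section.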

Now, let's see some code.

The code below represents a class which encapsulates the logic described above.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Globalization;

namespace DevelopmentSimplyPut.Utilities
{
    public class EmployeeToken
    {
        #region Properties
        private string firstName;
        public string FirstName
        {
            get
            {
                return firstName;
            }
        }

        private string secondName;
        public string SecondName
        {
            get
            {
                return secondName;
            }
        }

        private string concatenatedSections;
        public string ConcatenatedSections
        {
            get
            {
                return concatenatedSections;
            }
        }

        public string NamesSeparator
        {
            get
            {
                return "#;";
            }
        }

        public string NamesStringToReplace
        {
            get
            {
                return "#";
            }
        }

        public string NamesToReplaceTo
        {
            get
            {
                return "#&";
            }
        }
        #endregion Properties

        #region Constructors
        public EmployeeToken(string _firstName, string _secondName)
        {
            firstName = _firstName;
            secondName = _secondName;

            // Escape each name, then join the escaped names with the separator.
            concatenatedSections =
                string.Format(CultureInfo.InvariantCulture, "{0}{1}{2}"
                    , Encode(firstName)
                    , NamesSeparator
                    , Encode(secondName));
        }

        public EmployeeToken(string _concatenatedSections)
        {
            concatenatedSections = _concatenatedSections;

            if (!string.IsNullOrEmpty(concatenatedSections))
            {
                string[] separators = { NamesSeparator };
                string[] sections = concatenatedSections.Split(separators, StringSplitOptions.None);

                // Only a string with exactly two sections is accepted;
                // otherwise the names are left unset.
                if (null != sections && sections.Length == 2)
                {
                    firstName = Decode(sections[0]);
                    secondName = Decode(sections[1]);
                }
            }
        }
        #endregion Constructors

        #region Utilities
        // These helpers are instance methods because they read the instance properties above.
        private string Encode(string str)
        {
            return ((str == null) ? string.Empty : str.Replace(NamesStringToReplace, NamesToReplaceTo));
        }
        private string Decode(string str)
        {
            return ((str == null) ? string.Empty : str.Replace(NamesToReplaceTo, NamesStringToReplace));
        }
        #endregion Utilities
    }
}

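Now, if you use this class, you get the right results: the concatenated string is "Ahm#&;ed#;Tar#&ek", and both objects return "Ahm#;ed" and "Tar#ek" as the first and second names.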
EmployeeToken newEmp = new EmployeeToken("Ahm#;ed", "Tar#ek");
Console.WriteLine(newEmp.ConcatenatedSections);
Console.WriteLine(newEmp.FirstName);
Console.WriteLine(newEmp.SecondName);

EmployeeToken newEmp1 = new EmployeeToken(newEmp.ConcatenatedSections);
Console.WriteLine(newEmp1.ConcatenatedSections);
Console.WriteLine(newEmp1.FirstName);
Console.WriteLine(newEmp1.SecondName);


That's it. Hope you find this useful.


