
I have a simple string that contains the following content:

logo Name raj mobile 9038874774 address 6-98 india bill auto generated

I am trying to extract key/value pairs from it, and I expect the output to look like this:

[0] Key: Name  value:Raj
[1] Key: Mobile value:9038874774
[2] Key: Address value:6-98 india

Below is the code I am using to try to achieve this:

string[] lines = new string[] { "logo Name raj mobile 9038874774 address 6-98 india bill auto generated" };
   
// Get the position of the first space within each line

var pairs = lines.Select(l => new { Line = l, Pos = l.IndexOf(" ") });

// Build a dictionary of key/value pairs by splitting the string at the first space
var dictionary = pairs.ToDictionary(p => p.Line.Substring(0, p.Pos), p => p.Line.Substring(p.Pos + 1));

// Now you can retrieve values by key:
var value1 = dictionary["Name"]; 

This is what the output looks like in the debugger:

[screenshot of the debugger output]

The text string contains some words that are not needed, such as logo and bill auto generated; these should not end up in the key/value pairs. Please suggest how to achieve this as accurately as possible. The string comes from an image file that was converted to text using Tesseract OCR.

2 Answers


  1. Here’s an example using string.Split. I changed the casing of the words in line to match the keys, so you may still have to handle case differences in real input. I’m also assuming that Bill is a key that can safely be ignored (the same concern that @Mark Seemann raised in the comments).

    There are other potential key issues, though. For example, what if the name value is Bill?

    using System;
    using System.Collections.Generic;
    
    public static class Program
    {
        // Words that start a key/value pair (matched case-sensitively).
        private static readonly HashSet<string> _extractKeys = new() { "Name", "Mobile", "Address" };
        // Words that end the current value but are not stored themselves.
        private static readonly HashSet<string> _ignoredKeys = new() { "Bill" };
    
        public static void Main(string[] args)
        {
            var line = "logo Name raj Mobile 9038874774 Address 6-98 india Bill auto generated";
            var splitLine = line.Split(' ');
    
            var pairs = new Dictionary<string, string>();
    
            for (var i = 0; i < splitLine.Length; i++)
            {
                var candidateKey = splitLine[i];
                if (!_extractKeys.Contains(candidateKey))
                {
                    continue;
                }
    
                // Collect the words after the key until the next key or ignored word.
                var value = "";
                for (var v = i + 1; v < splitLine.Length; v++)
                {
                    var candidateValuePart = splitLine[v];
                    if (_ignoredKeys.Contains(candidateValuePart) || _extractKeys.Contains(candidateValuePart))
                    {
                        // Resume the outer loop at the word that ended this value.
                        i = v - 1;
                        break;
                    }
    
                    value = value + candidateValuePart + " ";
                }
    
                // Note: Add throws if the same key appears twice in the line.
                pairs.Add(candidateKey, value.Trim());
            }
    
            foreach (var kv in pairs)
            {
                Console.WriteLine("{0}: {1}", kv.Key, kv.Value);
            }
        }
    }
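
The case issue mentioned in this answer could be handled by giving the key set a case-insensitive comparer. The following is only a minimal sketch of that idea; the class name `KeySets` and the helper `IsKey` are hypothetical names introduced for illustration and are not part of the answer's code.

```csharp
using System;
using System.Collections.Generic;

public static class KeySets
{
    // Assumption: keys should match regardless of case, so "mobile" and "Mobile" count as the same key.
    public static readonly HashSet<string> ExtractKeys =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Name", "Mobile", "Address" };

    // Hypothetical helper: true when the word is one of the keys, ignoring case.
    public static bool IsKey(string word) => ExtractKeys.Contains(word);
}
```

With this set, `KeySets.IsKey("mobile")` and `KeySets.IsKey("Mobile")` both return true, while `KeySets.IsKey("logo")` returns false. If the matched word is then used as the dictionary key, it keeps the casing found in the input, so the result dictionary would probably also need `StringComparer.OrdinalIgnoreCase` for a lookup like `pairs["Name"]` to succeed on lower-case input.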
    
  2. This really looks like a parsing problem, but a quick-and-dirty implementation might be something like this:

    private static IDictionary<string, string> Parse(string input)
    {
        var keys = ImmutableHashSet.Create(
            StringComparer.OrdinalIgnoreCase, "Name", "Mobile", "Address");
        var ignoredKeys = ImmutableHashSet.Create(StringComparer.OrdinalIgnoreCase, "Bill");
        var allKeys = keys.Union(ignoredKeys);
        var dict = new Dictionary<string, string>();
    
        string? currentKey = null;
        foreach (var word in input.Split(' '))
        {
            if (allKeys.TryGetValue(word, out var key))
            {
                if (key != null)
                    dict[key] = "";
                currentKey = key;
            }
            else if (currentKey != null && dict.TryGetValue(currentKey, out var s))
                dict[currentKey] = (s + " " + word).Trim();
        }
    
        return new Dictionary<string, string>(dict.ExceptBy(ignoredKeys, kvp => kvp.Key));
    }
    

    Here, I’m assuming that "bill" is also a sort of key that should just be ignored.
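
Both answers split the input on a single space. Since the question says the text comes from OCR, it may be worth guarding against repeated spaces, tabs, or line breaks between words, although the question does not show such input. The sketch below assumes that kind of input; `OcrTokenizer` and `Tokenize` are hypothetical names, not part of either answer.

```csharp
using System;

public static class OcrTokenizer
{
    // Assumption: words may be separated by any mix of spaces, tabs, and line breaks.
    private static readonly char[] Separators = { ' ', '\t', '\r', '\n' };

    // Splits the input into words and drops the empty entries that
    // consecutive separators would otherwise produce.
    public static string[] Tokenize(string input) =>
        input.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
}
```

The resulting array could be used in place of `line.Split(' ')` or `input.Split(' ')` in the code above, leaving the rest of the parsing loop unchanged.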
