
Parsing input strings#

Issue#

Strings may carry invisible characters, most often appended at the end. These hidden characters cause comparisons and parsing to fail even when the visible text looks correct. When reading from an input field, take care not to reference the child text component directly. In other cases, the hidden characters must be removed explicitly.
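As a minimal sketch of the problem in plain C# (no Unity types), the snippet below appends a zero-width space by hand to stand in for text that picked one up. The class and method names are hypothetical and exist only for illustration.

```csharp
using System;

public static class HiddenCharacterDemo
{
    // Hypothetical helper: simulates text that picked up a trailing zero-width space (U+200B).
    public static string AppendZeroWidthSpace(string visibleText) => visibleText + "\u200b";

    public static void Run()
    {
        string expected = "hello";
        string actual = AppendZeroWidthSpace("hello");

        // Both strings look identical when displayed, but they are not equal.
        Console.WriteLine(expected == actual);  // False: == compares every UTF-16 code unit
        Console.WriteLine(expected.Length);     // 5
        Console.WriteLine(actual.Length);       // 6
    }
}
```

In a Unity project, Debug.Log can stand in for Console.WriteLine.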

Resolution#

TextMesh Pro input fields#

When reading text from a TMP input field, reference the TMP_InputField itself, not the underlying TextMeshProUGUI.

Warning: The child TextMeshProUGUI uses a zero-width space for layout purposes, and should not be referenced.

// 🟢 Correctly referencing the input field.
[SerializeField]
private TMP_InputField _inputField;

public void UseInput()
{
    string text = _inputField.text;
    // Use text
}

// 🔴 Incorrectly referencing the text component instead of the input field.
[SerializeField]
private TextMeshProUGUI _inputText;

public void UseInput()
{
    string text = _inputText.text;
    // Use text
}

Use TryParse!#

It's almost always worthwhile to use the TryParse variants of parsing functions to ensure proper handling of a parsing failure.

if (!int.TryParse(input, out int result))
{
    Debug.LogError($"{input} failed to parse to an integer value.");
    return;
}
// Use result
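The same pattern applies to other numeric types. The sketch below is a hypothetical helper, not part of any Unity or TMP API, and it assumes the input uses '.' as the decimal separator, which is why it passes CultureInfo.InvariantCulture. Note that a trailing zero-width space is not treated as whitespace by the parser, so it makes TryParse return false.

```csharp
using System.Globalization;

public static class ParseExample
{
    // Hypothetical helper: parses a float using the invariant culture ('.' as decimal separator).
    // Returns false instead of throwing when the input is not a valid number.
    public static bool TryReadFloat(string input, out float value)
    {
        return float.TryParse(input, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }
}
```

For example, TryReadFloat("3.5", out float v) succeeds, while TryReadFloat("3.5\u200b", out float v) fails until the hidden character is removed.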

Trimming whitespace#

To remove whitespace from strings you can generally use the .Trim() function, which returns a new string with leading and trailing whitespace removed. More complex removal may require the use of Regex, a complex and powerful language used to build patterns for searching text. Search for regex tutorials to get started on that journey.

The Trim() function may suffice for removing invisible characters from user input, but the character appended by the child TMP object is not removed by the parameterless Trim() and must be named explicitly:

input = input.Trim('\u200b');
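Because Trim('\u200b') removes only the zero-width space, input that also has ordinary whitespace around it needs both handled. One possible approach, sketched here with hypothetical helper names, is to pass every unwanted character to a single Trim call, and to use Replace when the character may appear in the middle of the string.

```csharp
using System;

public static class InputCleaner
{
    // Characters stripped from both ends: zero-width space plus common whitespace.
    private static readonly char[] TrimCharacters = { '\u200b', ' ', '\t', '\r', '\n' };

    // Hypothetical helper: removes hidden characters from the start and end only.
    public static string TrimHidden(string input) => input.Trim(TrimCharacters);

    // Hypothetical helper: removes zero-width spaces anywhere in the string.
    public static string RemoveZeroWidthSpaces(string input) =>
        input.Replace("\u200b", string.Empty);
}
```

TrimHidden leaves interior spaces untouched, so "a b" stays "a b", while " 42\u200b\r\n" becomes "42".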

Debugging#

  1. Check the Length of your string and compare it to what you expect.
  2. Index into the string to find the problematic character, and convert the result to hex. C# strings are UTF-16, so look up which character corresponds to the resulting code unit.
Debug.Log($"The problematic character is U+{(int)text[index]:X4}");
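When the position of the problematic character is not known in advance, it can help to dump every code unit at once. This is a minimal sketch with a hypothetical helper name. It reports UTF-16 code units, so a character outside the Basic Multilingual Plane appears as two entries.

```csharp
using System.Text;

public static class StringInspector
{
    // Hypothetical helper: lists each UTF-16 code unit as "index: U+XXXX", one per line.
    public static string DescribeCharacters(string text)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            builder.Append(i)
                   .Append(": U+")
                   .Append(((int)text[i]).ToString("X4"))
                   .Append('\n');
        }
        return builder.ToString();
    }
}
```

In Unity, the result can be passed to Debug.Log, for example Debug.Log(StringInspector.DescribeCharacters(text)).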

Common hidden characters#

| Character as code | UTF-16 | Description | Shorthand |
| --- | --- | --- | --- |
| ' ' | U+0020 | Space | SP |
| \n | U+000A | Line feed | LF |
| \r | U+000D | Carriage return | CR |
| \t | U+0009 | Character tabulation | TAB |
| \u200b | U+200B | Zero-width space [1] | ZWSP |

\r\n, or CRLF, is the common combination used to denote a new line on Windows. macOS and Unix both use \n, or LF.
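If text may come from more than one platform, normalizing line endings before comparing or splitting avoids stray \r characters. The helper below is a hypothetical sketch that converts CRLF and lone CR to LF.

```csharp
public static class LineEndings
{
    // Hypothetical helper: converts CRLF first, then any remaining lone CR, to LF.
    public static string NormalizeToLf(string text) =>
        text.Replace("\r\n", "\n").Replace('\r', '\n');
}
```

Replacing "\r\n" before '\r' matters: doing it in the opposite order would turn each CRLF into two line feeds.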

  1. See the first resolution on this page if you are encountering a zero-width space.