Split rows by abbreviation in Azure Data Factory

Taylormmmmm
June 15, 2023
260 views
0 votes
2 Answers

I am trying to split my column ‘State’ by abbreviating only the first 2 letters and also making it uppercase in the final output. As you can see from the example there is Ohio, Wisconsin, and Illinois at the top. I just want the output to be OH, WI, IL etc..

State example

I have used upper but not sure what else to use to shorten the length of the abbreviation.

Answers

- SaideepArikontham
- June 15, 2023 at 6:46 am
- 0 votes
0
- You can use the azure dataflow’s upper() and left functions to achieve this requirement. The following is the sample data that I have taken as source:
- Now using the derived column transformation as shown below, you can get the required output:
```
upper(left(State,2))
```
- The following is how the result would appear in the final data preview:
NOTE: This does not account for states having same first 2 letters and thus causing a problem. For example, North Carolina and North Dakota or Tennessee and Texas. These cases should be handled separately if they are not handled already.
Login or Signup to reply.

- divyang4481
- June 15, 2023 at 6:52 am
- 0 votes
0
1. Add a "Copy Data" activity to your pipeline to read the source data.
  Configure the source dataset and linked service according to your data source.
2. Add a "Derived Column" activity after the "Copy Data" activity.
  3.Configure the "Derived Column" activity:
  Select the source dataset and linked service.
  In the "Derived Column" tab, click on the "+ Add" button to add a new derived
  column.
  
  Provide a name for the derived column (e.g., "Abbreviation").
  
  In the "Expression" field, use the "split" and "upper" functions to split the
  rows by the abbreviation and convert it to uppercase.
  
  For example, if your data has a column named "Description" and the abbreviation
  is enclosed in square brackets (e.g., "[Ohio]"), you can use the following
  expression:
  
  upper(split(Description, ‘[‘)[1])
This expression splits the "Description" column by the ‘[‘ character, retrieves the second element of the resulting array (which represents the abbreviation), and converts it to uppercase using the "upper" function.
Add a "Sink" activity after the "Derived Column" activity to write the data to the desired destination.
Configure the sink dataset and linked service according to your destination
Login or Signup to reply.

Please signup or login to give your own answer.

Click here to cancel reply.