You are working at a Data Analytics company, and in one of your projects, you need to load data from a source into a target table using Talend. However, while running the Talend Job, the data will fail to load due to parsing and formatting issues.
Luckily, Talend provides several useful built-in functions for formatting dates. There are two main ways to access these functions:
1.Using a tMap component
To apply date formatting within a tMap, follow these steps:
→ Double-click the tMap component to open it.
→ On the output side, locate the date column that needs formatting.
→ Click on the three dots (…) to the right of the output date column.
→ In the Expression Builder, you will see the link to the input column (the one you want to convert).
→ At the bottom of the Expression Builder, expand the Categories section.
→ Select the appropriate function from the list.
→ Click Ok to confirm the expression.
→ Click Apply, then Ok to close the tMap.
2. Using Global Routines
Talend provides a set of built-in global routines for handling common operations such as date parsing and formatting. These routines can be used directly across multiple components, offering more flexibility than using expressions only within a tMap.
You can explore and understand these routines, along with usage examples, by following these steps:
→ In Talend Studio, go to the Repository panel.
→ Expand Code > Global Routines > system.
→ Double-click the appropriate class to open it in the code editor on the Canvas.
For this scenario, we are particularly interested in the TalendDate class. If you double-click on TalendDate, you will see several functions like formatDate() and parseDate(). You can scroll through the file to view documentation and examples for each function.
Let’s now switch to Talend and walk through an example.
You’ve received data in a CSV file, and you need to format the date values before inserting them into the target table. For simplicity, in this example, we’ll output the results using a tLogRow component instead of writing to a database.
The date values in the CSV are in the format MM-dd-yyyy, and they need to be converted to yyyy-MM-dd.
We will use the following components in Talend:
1.tFileInputDelimited
This component is used to read the data from the CSV source file.
Double-click on the component and fill in the following fields:
→ File name/Stream: Enter the full path to the CSV file, including the file name.
→ Field Separator: Enter a comma (,) if the file is comma-separated.
→ Header: If the CSV file includes a header row, insert 1; otherwise, leave it to 0.
2. tMap
We use this component to map input columns to output columns and to format the date column. In this case, we will use the parseDate() and formatDate() functions to transform the date format.
→ Link tMap component to tFileInputDelimited using a Main.
→ Double-click on tMap to open it.
→ Map input columns to output columns.
→ Click on the three dots (…) next to the order_date column.
→ In the Expression Builder window select the parseDate() and formatDate() functions.
→ Click ok.
→ Then, click Ok and Apply to close tMap.
3. tLogRow
This component outputs the data, and we will be able to see what the output data looks like before inserting it into the target database.
→ Link tLogRow to tMap using a Main.
→ Go to the Basic Settings tab, and select the Table Mode – the output will be displayed in a table style.
→ Make sure the columns are synchronised.
Now, you can run the Job. Open the Run tab and click on Run. The output will be displayed in the console. Check if the order_date column has the required format.