→ Open Talend using the “Run as administrator” option to ensure you have the appropriate privileges.
→ In the Repository, right-click on Job Designs and select Create job.
→ In the New job window, enter an appropriate name for the Job. You may also optionally add a purpose and description.
→ Click Finish.
→ Context Group for Database Connection
The database used in this example is MySQL, with a username and password defined during installation, using a localhost connection.
In the Repository, right-click on Contexts and select Create context group.
In the Create/Edit a context group window (Step 1 of 2), enter the name DBConnection. You may also add a purpose and a description if needed.
In the Create/Edit a context group window (Step 2 of 2), click the “+” icon on the right-hand side. The Configure Contexts window appears, displaying a default context. Select it and click Edit. In the Rename Context window, type DEV and click OK. Then, click OK again to confirm.
Then, click the “+” icon at the bottom of the Create/Edit a context group window (Step 2 of 2) to add the required rows of database credentials as context variables. Be sure to specify the appropriate data type for each variable. Add the following credentials:
→ Context Group for Logging Purpose
A table called job_log has already been created in the training schema within the MySQL database. This table contains the following columns: id, job_name, purpose, schema_name, job_start_dt, job_end_dt, job_duration, job_status, row_count_exported. The id column is defined as a sequence, so we will not explicitly set it as a context variable, as its value auto-increments.
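The original setup for the job_log table is not shown; a DDL sketch along these lines would match the columns described above (the column types are assumptions inferred from the column names and how they are used later in this tutorial):

```sql
-- Hypothetical DDL for the logging table; types are assumed, not taken
-- from the original setup.
CREATE TABLE training.job_log (
    id                 INT AUTO_INCREMENT PRIMARY KEY,  -- auto-incrementing sequence
    job_name           VARCHAR(100),
    purpose            VARCHAR(255),
    schema_name        VARCHAR(100),
    job_start_dt       DATETIME,
    job_end_dt         DATETIME,
    job_duration       INT,            -- duration in seconds
    job_status         VARCHAR(20),    -- e.g. 'Success' or 'Failure'
    row_count_exported INT
);
```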
In the Repository, right-click on Contexts and select Create context group.
In the Create/Edit a context group window (Step 1 of 2), enter the name Logging. You may also add a purpose and a description if needed.
In the Create/Edit a context group window (Step 2 of 2), click the “+” icon on the right-hand side. The Configure Contexts window appears, displaying a default context. Select it and click Edit. In the Rename Context window, type DEV and click OK. Next, click OK again to confirm.
Then, click the “+” icon at the bottom of the Create/Edit a context group window (Step 2 of 2) to add context variables for logging purposes. Be sure to specify the appropriate data type for each variable. For the Value field, we will add some default values to avoid warnings about strings appearing at runtime. These defaults will be overwritten by the actual values defined during execution. Add the following context variables, matching the columns in the MySQL logging table:
→ Built-In Context Variable for the File Name
Open the Job and go to the Contexts View.
In the Contexts View, click the “+” icon on the right-hand side. The Configure Contexts window appears, displaying a default context. Select it and click Edit. In the Rename Context window, type DEV and click OK. Then, click OK again to confirm.
Next, click the “+” icon at the bottom of the Contexts View to add a context variable for the file name. Be sure to select the appropriate data type. Add the following context variable:
Row 1 – Name: filePath, Type: String, Value: C:/Program Files/RPM/Retail_Shop_Data.xlsx
→ Adding the Context Groups to the Job
There are three options for adding context groups to the Job:
→ Add a tPreJob component
The tPreJob component is particularly useful for setting up configurations or executing preliminary tasks on which the main Job may depend.
→ Add a tJava component
Link this component to the tPreJob component using an OnComponentOk trigger.
This component is used to initialize some of the context variables. When initializing a context variable in a tJava component, use the following syntax: context.VariableName = <the_value_to_be_assigned>;. Be sure to add a semicolon at the end of each line of code. Initialize the following variables:
Enter the schema name, enclosed in double quotes. This can also be set at either the beginning or end of the flow.
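The initialization code in the tJava component would look roughly like the following. This is a sketch, not the exact code from the Job: the purpose text is an assumed example, and it presumes job_start_dt is initialized here as well. Note that this fragment only runs inside Talend, where context.* refers to the context variables defined earlier, jobName is a Talend built-in variable, and TalendDate is Talend's date routine.

```java
// tJava > Basic Settings > Code (runs inside the Talend Job, not standalone)
context.job_name = jobName;                        // Talend built-in variable
context.purpose = "Load daily supermarket data";   // assumed example text
context.schema_name = "training";                  // schema name in double quotes
context.job_start_dt = TalendDate.formatDate("yyyy-MM-dd HH:mm:ss",
        TalendDate.getCurrentDate());              // Job start timestamp
```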
→ Add a tDBConnection component
Link this component to the tJava component using an OnComponentOk trigger.
The tDBConnection component is used to establish a connection to the MySQL Database. Follow these configuration steps:
Click on the component.
From the Database drop-down menu, select MySQL and verify that the DB version is correct.
Configure the context variables for the database connection:
Host: context.Host
Port: context.Port
Database: context.Database
Username: context.Username
Password: click the three dots to the right of the Password field, remove any quotes, type context.Password, and click OK.
Go to the Advanced Settings tab and enable the Auto Commit option. This will ensure that data is automatically committed to the database when the Job finishes.
→ Add a tFileInputExcel component
This component marks the start of the main flow in the Job. It is used to read the daily data for a specific supermarket.
In the Basic Settings tab, next to File name/Stream, add the Built-In context variable, context.filePath. You can do this by either entering it manually or using the auto-completion feature. In the Sheet list field, click on the “+” sign and enter the name of the sheet you are reading data from, enclosed in double quotes. If the Excel file contains column headers, enter 1 in the Header section. Next, click on the three dots to the right of the Edit schema and add the column names along with their respective data types. Depending on the format of the Excel file, you may need to enable the Read excel2007 file format (xlsx) option.
→ Add a tDBOutput component
Link this component to the tFileInputExcel using a Main row trigger.
The tDBOutput component transfers data from the Excel file into the database table named "Shop 1". This table was previously created in the MySQL training schema. Make sure to use the existing database connection defined at the beginning by enabling the Use an existing connection option. The Action on table should remain at its default setting, and the Action on data should be set to Insert. Also, ensure that the schema is synchronized with the input. In the Table field, enclose the name of the table in double quotes.
→ Logging Logic
If the data transfer is successful:
Add a tJava component
Link this component to tFileInputExcel using an OnSubJobOk trigger. This instructs Talend to follow this branch if the data transfer completes successfully.
Define the following context variables:
→ context.job_end_dt = TalendDate.formatDate("yyyy-MM-dd HH:mm:ss", TalendDate.getCurrentDate());
MySQL accepts this timestamp format and allows you to store the Job end time in the logging table. It’s typically set at the end of the flow.
→ context.job_duration = TalendDate.diffDate(TalendDate.parseDate("yyyy-MM-dd HH:mm:ss", context.job_end_dt), TalendDate.parseDate("yyyy-MM-dd HH:mm:ss", context.job_start_dt), "ss");
This Talend expression calculates the time difference, in seconds, between two timestamps: context.job_end_dt and context.job_start_dt. It first converts these timestamp strings into date objects using the specified “yyyy-MM-dd HH:mm:ss” format, then finds the difference between them using diffDate, with “ss” indicating the result in seconds. This is useful for tracking how long a Job takes to run, aiding in performance monitoring and optimization.
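Outside of Talend, the same parse-then-subtract logic can be replicated with standard Java, which may help clarify what diffDate is doing (this is an equivalent sketch using SimpleDateFormat, not Talend's actual implementation; the timestamps in main are made-up examples):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class JobDuration {
    // Parses two "yyyy-MM-dd HH:mm:ss" timestamps and returns their
    // difference in seconds, mirroring TalendDate.diffDate(end, start, "ss").
    static long durationSeconds(String startTs, String endTs) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        try {
            Date start = fmt.parse(startTs);
            Date end = fmt.parse(endTs);
            return (end.getTime() - start.getTime()) / 1000L;
        } catch (ParseException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Example: a Job that started at 10:00:00 and ended at 10:02:30
        System.out.println(durationSeconds("2024-05-01 10:00:00",
                                           "2024-05-01 10:02:30")); // 150
    }
}
```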
→ context.row_count_exported = ((Integer) globalMap.get("tDBOutput_2_NB_LINE"));
The system variable of the output component is used to retrieve the total number of records that have been transferred.
→ context.job_status = "Success";
Since we are on the Success branch, we hard-code the status as "Success" just before inserting it into the log table.
Add a tDBRow component
Link this component to the tJava one using an OnComponentOk trigger.
Make sure to enable the Use an existing connection option and select the one defined at the beginning of the flow. In the Query section, include the INSERT statement.
The INSERT statement is:
"insert into job_log (job_name, purpose, schema_name, job_start_dt, job_end_dt, job_duration, job_status, row_count_exported)
values (
'" + context.job_name + "',
'" + context.purpose + "',
'" + context.schema_name + "',
'" + context.job_start_dt + "',
'" + context.job_end_dt + "',
'" + context.job_duration + "',
'" + context.job_status + "',
'" + context.row_count_exported + "'
)"
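The query above is an ordinary Java string in which each context value is spliced between single quotes via concatenation. A reduced, standalone sketch of that pattern, using only two of the columns (the job name and status values are invented examples):

```java
public class LogQueryDemo {
    // Builds an INSERT statement the same way the tDBRow query does:
    // the SQL is a Java string, and each value is concatenated in
    // between single quotes.
    static String buildInsert(String jobName, String jobStatus) {
        return "insert into job_log (job_name, job_status) values ('"
                + jobName + "', '" + jobStatus + "')";
    }

    public static void main(String[] args) {
        System.out.println(buildInsert("ExportShopData", "Success"));
        // insert into job_log (job_name, job_status) values ('ExportShopData', 'Success')
    }
}
```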
Add a tDBClose connection
Link this component to the tDBRow one using an OnComponentOk trigger. This way, we ensure that the connection is closed at the end of the flow and the data is committed.
→ If the data transfer is not successful:
Add a tJava component
Link this component to tFileInputExcel using an OnSubJobError trigger. This instructs Talend to follow this branch if an error occurs during data transfer.
Define the same context variables as described above, except for context.job_status, which is defined as follows:
context.job_status = "Failure";
Add a tDBRow component
Link this component to the tJava component using an OnComponentOk trigger. The same configuration and query apply here as in the tDBRow component of the successful case.
Add a tDBClose connection
Link this component to the tDBRow one using an OnComponentOk trigger. This way, we ensure that the connection is closed at the end of the flow and the data is committed.
Ensure that the settings mentioned in steps 1-3 are applied.
Run the Job.
Verify that there are no errors at runtime and that the Job finishes with an Exit code of 0.
Optionally, check the shop1 and job_log tables in the training schema to confirm that data has been properly inserted.
Here is the result in Talend:
Watch the video tutorial here:
About btProvider:
We hold all possible technical and sales certifications for all products:
Talend, Tableau Desktop, Tableau Public, Tableau Server, Tableau Prep, Tableau Data Management, Vertica, Salesforce, Mulesoft, Write-Back.
With Talend, the best data management platform, you can always understand your data better. Using Talend, you will make intelligent, strategic decisions and seamlessly integrate, assure the quality of, and govern your data. See other #skillpill video tutorials here: