Importing Data from a Text File
In the Factory, you can import data directly from a text file and define its basic parsing options. You can do this by either:
Dragging the text file directly onto the Stage, which automatically creates an Import from Text File task.
Dragging an Import from Text File task onto the Stage: this gives you finer control over the import, as you can:
Import a single file, specifying the parsing and import options for the specific file;
Import multiple files together, concatenating them into a single table. This means that all files imported together must have the same structure.
A preview of the imported table(s) is always displayed.
Prerequisites
You must have created a flow;
If you are importing multiple files, they need to have the same structure.
Procedure
Drag an Import from Text File task onto the Stage.
Double-click the task to open the task menu.
Choose from the drop-down list whether you want to import the file from your computer (Local) or from a remote connection.
If you are importing from a Remote Filesystem, choose it from the list and then click the pencil button to set the required connection information (only if you are using a Custom source). The tables are loaded in the Files tab.
If you are using a Local Filesystem, click on the Select button and choose the path where the file is stored.
Click on the Add new path button, located next to the Select button, to add further paths where other files to be imported are stored. You can add as many paths as you want.
Click on the X button, located next to the Select button, to remove the corresponding path.
Click on the Delete all paths button, located under the Add new path button, to remove all the inserted paths.
Choose the resources to import by clicking the Select button in the Path 1 section.
Click on the Add new path button, located next to the Select button, to add a new path for a new resource. You can add as many paths as you want.
Click on the X button, located next to the Select button, to remove the corresponding path.
Click on the Delete all paths button, located under the Add new path button, to remove all the inserted paths.
Click on the Concatenation Type’s drop-down arrow to choose Inner or Outer concatenation.
With Inner concatenation, the final table includes only the attributes that exist in all the imported tables.
With Outer concatenation, the final table includes all the attributes, and any values that do not exist in a given table are left as missing.
Click on the Match Column by drop-down arrow and choose whether you want to match the columns by their Name or by their Position.
Click on the Text Configuration tab and set the Parsing options and Import options, as shown in the table below.
Save and compute the task.
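The effect of the Concatenation Type and Match Column by options can be illustrated outside the product with a short Python sketch. This is not how the Import from Text File task is implemented; the function names `read_table` and `concatenate`, the sample data, and the use of `None` for missing values are all assumptions made for illustration only.

```python
import csv
import io


def read_table(text, delimiter=","):
    """Parse delimited text into a (header, rows) pair."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    return header, [row for row in reader]


def concatenate(tables, how="inner", match_by="name"):
    """Stack tables vertically (hypothetical helper, for illustration).

    how: "inner" keeps only columns present in every table;
         "outer" keeps every column and fills gaps with None.
    match_by: "name" aligns columns by header text;
              "position" aligns them by index, reusing the first table's names.
    """
    if match_by == "position":
        # Rename each table's columns after the first table's header, by index.
        first_header = tables[0][0]
        tables = [
            ([first_header[i] if i < len(first_header) else name
              for i, name in enumerate(header)], rows)
            for header, rows in tables
        ]

    # Ordered union of all column names.
    all_columns = []
    for header, _ in tables:
        for name in header:
            if name not in all_columns:
                all_columns.append(name)

    if how == "inner":
        columns = [c for c in all_columns
                   if all(c in header for header, _ in tables)]
    else:
        columns = all_columns

    result = []
    for header, rows in tables:
        index = {name: i for i, name in enumerate(header)}
        for row in rows:
            result.append([row[index[c]] if c in index else None
                           for c in columns])
    return columns, result


# Two small sample files with partly different attributes.
file_a = read_table("id,name\n1,Ann\n2,Bob\n")
file_b = read_table("id,city\n3,Rome\n")

inner_cols, inner_rows = concatenate([file_a, file_b], how="inner")
outer_cols, outer_rows = concatenate([file_a, file_b], how="outer")
pos_cols, pos_rows = concatenate([file_a, file_b], match_by="position")
```

In this sketch, inner concatenation by name keeps only `id`, outer concatenation keeps `id`, `name` and `city` with gaps left empty, and matching by position treats the second column of both files as the same attribute regardless of its header.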
Parsing and Import options
Settings options | Description |
---|---|
Parsing options | Here you can set:
|
Import options |
|
Focus on the Case Sensitive checkbox
We encourage you not to select the Case Sensitive checkbox, as it has a significant impact on the data analysis. If the Case Sensitive checkbox is selected, the number of distinct values in a column can increase, which changes the results of the data analysis. In fact, if we have two strings, 'Word' and 'word', they will be considered as two distinct values. This means that, if you write a function valid for the string 'word', it won't be valid for the string 'Word' too. It might also have consequences on attributes: if we want to apply a function to the $"Word" attribute and we type $"word" in the formula bar, an error occurs during the computation process.
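The consequences described above can be sketched in a few lines of Python. This only illustrates the general effect of case-sensitive comparison; it does not reproduce the product's behavior. The helpers `distinct_values` and `lookup_attribute` are hypothetical, and treating case-insensitive comparison as lowercase folding is an assumption.

```python
def distinct_values(values, case_sensitive):
    """Return the set of distinct values in a column."""
    if case_sensitive:
        return set(values)
    # Assumption: case-insensitive comparison folds values to lowercase.
    return {value.lower() for value in values}


def lookup_attribute(columns, name, case_sensitive):
    """Return the column matching `name`, or raise KeyError if none does."""
    for column in columns:
        if column == name:
            return column
        if not case_sensitive and column.lower() == name.lower():
            return column
    raise KeyError(name)


column_values = ["Word", "word", "WORD", "other"]

# More distinct values appear when case is taken into account.
sensitive_count = len(distinct_values(column_values, case_sensitive=True))
insensitive_count = len(distinct_values(column_values, case_sensitive=False))

# Referring to the "Word" attribute as "word" only works when case is ignored.
found = lookup_attribute(["Word"], "word", case_sensitive=False)
try:
    lookup_attribute(["Word"], "word", case_sensitive=True)
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

With case sensitivity on, the sample column has more distinct values and the lookup of `word` against an attribute named `Word` fails, which mirrors the error described for the formula bar.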