(2.1) Azure Data Factory
This is part of a series of articles called Azure Challenges. You can refer to the Intro Page to understand more about how the challenges work.
To start, please download the file below from Kaggle.
Select Create a Resource
Search for Data Factory
Fill in the details for your Data Factory resource (subscription, resource group, name, region, and version).
Select Review + create to start the deployment process.
After confirming all the information, hit Create.
When the deployment is complete, hit Go to resource.
Now you have your Data Factory resource up and running
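If you would rather keep a repeatable record of this deployment, the portal steps above can be approximated with an ARM template. The sketch below builds one as a Python dictionary and prints it as JSON. The factory name my-data-factory and the region eastus are placeholder assumptions, not values from this walkthrough.

```python
import json


def build_factory_template(factory_name, location):
    """Return a minimal ARM template that declares one Data Factory (V2)."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [
            {
                "type": "Microsoft.DataFactory/factories",
                "apiVersion": "2018-06-01",
                "name": factory_name,
                "location": location,
                # System-assigned managed identity for the factory.
                "identity": {"type": "SystemAssigned"},
                "properties": {},
            }
        ],
    }


# Placeholder values (assumptions): replace with your own name and region.
template = build_factory_template("my-data-factory", "eastus")
print(json.dumps(template, indent=2))
```

Saved to a file, a template like this could be deployed with `az deployment group create --resource-group <your-resource-group> --template-file <file>`, assuming the Azure CLI is installed and you are logged in.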
Now open your storage account.
Inside your storage account, select the Storage Explorer option to see all the files available in it.
After that, use the Upload option to add the new file Twitter_Data.csv to your storage account.
The ACCESS TIER column shows which access tier is set for each file. You can change it if required.
Now let’s go back to the Azure Portal Home and choose your Data Factory resource.
Open Data Factory Studio.
Inside Data Factory Studio, choose the Ingest option.
Note that the Data Factory Studio home page offers several options:
- Ingest
- Orchestrate
- Transform data
- Configure SSIS
Because the goal of this example is to copy the data and ingest it into a SQL database, we use the Built-in copy task option.
Now choose the data source (in this case, Azure Blob Storage) and enter its connection info.
Select Browse and find the file inside your container (in my case, it is called my-container). Hit Next >
Data Factory automatically identifies the file type as comma-separated values (.csv). Hit Next >
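If you want to confirm locally what the wizard detected, a similar check can be done with Python's standard csv module. This is only a sketch and not how Data Factory detects the format internally. The column names clean_text and category in the sample are assumptions used for illustration.

```python
import csv
import io


def describe_csv(sample_text):
    """Guess the delimiter of a CSV sample and return it with the header and row count."""
    # Restrict the guess to a few common delimiters.
    dialect = csv.Sniffer().sniff(sample_text, delimiters=",;\t|")
    reader = csv.reader(io.StringIO(sample_text), dialect)
    header = next(reader)                # first line is treated as the header
    row_count = sum(1 for _ in reader)   # remaining lines are data rows
    return dialect.delimiter, header, row_count


# Hypothetical sample standing in for the first lines of Twitter_Data.csv.
sample = "clean_text,category\nhello world,1\ngood morning,-1\n"
delimiter, header, row_count = describe_csv(sample)
print(delimiter, header, row_count)
```

For the real file, you could pass the first few kilobytes read from disk instead of the inline sample.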
Now we have all the info about the Source. The next step is to add the info about the destination (Target).
Choose the type of connection to create (in this case, Azure SQL Database).
Enter the information about your Azure SQL Database (the same info you used to create it) and hit Create.
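Behind the form, Data Factory stores this connection as a linked service. The sketch below shows roughly what such a definition looks like in JSON, built here as a Python dictionary. The linked service name, server, database, and credentials are all placeholders (assumptions), and in practice the password would come from a secret store rather than from source code.

```python
import json


def build_sql_linked_service(name, server, database, user, password):
    """Return a linked service definition of type AzureSqlDatabase."""
    connection_string = (
        f"Data Source=tcp:{server}.database.windows.net,1433;"
        f"Initial Catalog={database};"
        f"User ID={user};Password={password};"
        "Encrypt=True;Connection Timeout=30"
    )
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {"connectionString": connection_string},
        },
    }


# All values below are placeholders (assumptions), not real credentials.
linked_service = build_sql_linked_service(
    "AzureSqlDbLinkedService", "my-server", "my-database", "my-user", "<password>"
)
print(json.dumps(linked_service, indent=2))
```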
Define the name of the table to be created in your Azure SQL Database with the new data.
Define column mapping
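In the pipeline definition, a column mapping like this is expressed as a translator of type TabularTranslator on the copy activity. The sketch below builds one from a list of (source, sink) column pairs. The column names are hypothetical, since the actual columns depend on your file and target table.

```python
import json


def build_column_mapping(pairs):
    """Return a TabularTranslator that maps each source column to a sink column."""
    return {
        "type": "TabularTranslator",
        "mappings": [
            {"source": {"name": src}, "sink": {"name": dst}}
            for src, dst in pairs
        ],
    }


# Hypothetical column names (assumptions) for the CSV and the SQL table.
translator = build_column_mapping([
    ("clean_text", "tweet_text"),
    ("category", "sentiment"),
])
print(json.dumps(translator, indent=2))
```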
Define a name for your copy pipeline task.
Confirm all the data for your Copy Pipeline Task and hit Next >.
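The task you just confirmed corresponds to a pipeline with a single Copy activity. As a rough sketch, its JSON definition could look like the dictionary built below. The pipeline, activity, and dataset names are placeholders (assumptions), and the wizard generates its own names and additional settings.

```python
import json


def build_copy_pipeline(pipeline_name, source_dataset, sink_dataset):
    """Return a pipeline definition with one Copy activity from CSV to Azure SQL."""
    copy_activity = {
        "name": "CopyBlobToSql",  # placeholder activity name
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "DelimitedTextSource"},  # reads the .csv file
            "sink": {"type": "AzureSqlSink"},           # writes to Azure SQL Database
        },
    }
    return {"name": pipeline_name, "properties": {"activities": [copy_activity]}}


# Placeholder names (assumptions): replace with the names used in your factory.
pipeline = build_copy_pipeline("CopyTwitterData", "SourceCsvDataset", "SinkSqlDataset")
print(json.dumps(pipeline, indent=2))
```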
After you confirm, the process will start. Congratulations! Your data has been copied from your Azure Blob Storage to your Azure SQL Database.
Now, just hit Finish.
Next step (2.2) Using Query Editor to visualize your data