Google BigQuery Connector for Azure Data Factory (Pipeline)
Read and write Google BigQuery data inside your app without coding, using an easy-to-use, high-performance API connector.
In this article you will learn how to quickly and efficiently integrate Google BigQuery data into Azure Data Factory (Pipeline) without coding. We will use the high-performance Google BigQuery Connector to connect to Google BigQuery and then access the data inside Azure Data Factory (Pipeline).
Let's follow the steps below to see how we can accomplish that!
Google BigQuery Connector for Azure Data Factory (Pipeline) is based on ZappySys API Driver which is part of ODBC PowerPack. It is a collection of high-performance ODBC drivers that enable you to integrate data in SQL Server, SSIS, a programming language, or any other ODBC-compatible application. ODBC PowerPack supports various file formats, sources and destinations, including REST/SOAP API, SFTP/FTP, storage services, and plain files, to mention a few.
Create ODBC Data Source (DSN) based on ZappySys API Driver
Step-by-step instructions
To get data from Google BigQuery using Azure Data Factory (Pipeline) we first need to create a DSN (Data Source) which will access data from Google BigQuery. We will later be able to read data using Azure Data Factory (Pipeline). Perform these steps:
- 
	    Download and install ODBC PowerPack. 
- 
	Open ODBC Data Sources (x64):   
- 
	Create a User data source (User DSN) based on ZappySys API Driver. Choose the DSN type according to how the client application runs:
- 
	                Create and use a User DSN if the client application runs under a User Account. This is an ideal option at design time, when developing a solution, e.g. in Visual Studio 2019. Use it for both types of applications, 64-bit and 32-bit.
- 
	                Create and use a System DSN if the client application is launched under a System Account, e.g. as a Windows Service. Usually, this is an ideal option in a production environment. Use ODBC Data Source Administrator (32-bit) instead of the 64-bit version if the Windows Service is a 32-bit application.
 Azure Data Factory (Pipeline) uses a Service Account when a solution is deployed to a production environment; therefore, for production you have to create and use a System DSN.
- 
	    When the Configuration window appears, give your data source a name if you haven't done that already, then select "Google BigQuery" from the list of Popular Connectors. If "Google BigQuery" is not present in the list, click "Search Online" and download it. Then set the path to the location where you downloaded it. Finally, click Continue >> to proceed with configuring the DSN (named GoogleBigqueryDSN in this article):
- 
        Now it's time to configure the Connection Manager. Select Authentication Type, e.g. Token Authentication. Then select API Base URL (in most cases, the default one is the right one). More info is available in the Authentication section.
Google BigQuery authentication: User Account [OAuth]
User accounts represent a developer, administrator, or any other person who interacts with Google APIs and services. User accounts are managed as Google Accounts, either with Google Workspace or Cloud Identity. They can also be user accounts that are managed by a third-party identity provider and federated with Workforce Identity Federation. [API reference]
Follow these steps to create Client Credentials (User Account principal) to authenticate and access the BigQuery API in an SSIS package or ODBC data source:
WARNING: If you are planning to automate processes, we recommend that you use the Service Account authentication method. If you still need to use a User Account, make sure you use a system/generic account (e.g. automation@my-company.com). When you use a personal account that is tied to a specific employee profile and that employee leaves the company, the token may become invalid and any automated processes using that token will start to fail.
Step-1: Create project
This step is optional if you already have a project in Google Cloud and can use it. If you don't, follow these simple steps to create one:
- 
          First of all, go to Google API Console. 
- 
          Then click Select a project button and then click NEW PROJECT button:   
- 
          Name your project and click CREATE button:   
- 
          Wait until the project is created:   
- Done! Let's proceed to the next step.
Step-2: Enable Google Cloud APIs
In this step we will enable BigQuery API and Cloud Resource Manager API:
- 
          Select your project on the top bar:   
- 
          Then click the "hamburger" icon on the top left and access APIs & Services:   
- 
        Now let's enable several APIs by clicking ENABLE APIS AND SERVICES button:   
- 
        In the search bar, search for bigquery api and then locate and select BigQuery API:
- 
        If BigQuery API is not enabled, enable it:   
- 
        Then repeat the step and enable Cloud Resource Manager API as well:   
- Done! Let's proceed to the next step.
Step-3: Create OAuth application
- 
        First of all, click the "hamburger" icon on the top left and then hit VIEW ALL PRODUCTS:   
- 
        Then access Google Auth Platform to start creating an OAuth application:   
- 
        Start by pressing GET STARTED button:   
- 
        Next, continue by filling in App name and User support email fields:   
- 
        Choose Internal option, if it's enabled, otherwise select External:   
- 
        This step is optional if you used the Internal option in the previous step. However, if you had to use the External option, click ADD USERS to add a user:
- 
        Then add your contact Email address:   
- 
        Finally, check the checkbox and click CREATE button:   
- Done! Let's create Client Credentials in the next step.
Step-4: Create Client Credentials
- 
        In Google Auth Platform, select Clients menu item and click CREATE CLIENT button:   
- 
        Choose Desktop app as Application type and name your credentials:
- 
        Continue by opening the created credentials:   
- 
        Finally, copy Client ID and Client secret for the later step:   
-   
        Done! We have all the data needed for authentication, let's proceed to the last step! 
Step-5: Configure connection
- 
        Now go to the SSIS package or ODBC data source and use the previously copied values in the User Account authentication configuration:
        - In the ClientId field, paste the Client ID value.
        - In the ClientSecret field, paste the Client secret value.
- 
        Press Generate Token button to generate Access and Refresh Tokens. 
- 
        Then choose ProjectId from the drop down menu. 
- 
        Continue by choosing DatasetId from the drop down menu. 
- 
        Finally, click Test Connection to confirm the connection is working. 
- 
        Done! Now you are ready to use Google BigQuery Connector! 
API Connection Manager configuration
Just perform these simple steps to finish authentication configuration:
- 
                            Set Authentication Type to User Account [OAuth]
- Optional step. Modify API Base URL if needed (in most cases default will work).
- Fill in all the required parameters and set optional parameters if needed.
- Press Generate Token button to generate the tokens.
- Finally, hit OK button:
 Settings shown in the configuration window:
 Data source name: GoogleBigqueryDSN
 Connector: Google BigQuery
 Authentication Type: User Account [OAuth]
 API Base URL: https://www.googleapis.com/bigquery/v2
 Required Parameters:
 - UseCustomApp
 - ProjectId (Choose after [Generate Token] clicked)
 - DatasetId (Choose after [Generate Token] clicked and ProjectId selected)
 Optional Parameters:
 - ClientId
 - ClientSecret
 - Scope: https://www.googleapis.com/auth/bigquery https://www.googleapis.com/auth/bigquery.insertdata https://www.googleapis.com/auth/cloud-platform https://www.googleapis.com/auth/cloud-platform.read-only https://www.googleapis.com/auth/devstorage.full_control https://www.googleapis.com/auth/devstorage.read_only https://www.googleapis.com/auth/devstorage.read_write
 - RetryMode: RetryWhenStatusCodeMatch
 - RetryStatusCodeList: 429|503
 - RetryCountMax: 5
 - RetryMultiplyWaitTime: True
 - Job Location
 - Redirect URL (Only for Web App)
Google BigQuery authentication: Service Account (Using *.json OR *.p12 key file) [OAuth]
Service accounts are accounts that do not represent a human user. They provide a way to manage authentication and authorization when a human is not directly involved, such as when an application needs to access Google Cloud resources. Service accounts are managed by IAM. [API reference]
Follow these steps to create a Service Account to authenticate and access the BigQuery API in an SSIS package or ODBC data source:
Step-1: Create project
This step is optional if you already have a project in Google Cloud and can use it. If you don't, follow these simple steps to create one:
- 
          First of all, go to Google API Console. 
- 
          Then click Select a project button and then click NEW PROJECT button:   
- 
          Name your project and click CREATE button:   
- 
          Wait until the project is created:   
- Done! Let's proceed to the next step.
Step-2: Enable Google Cloud APIs
In this step we will enable BigQuery API and Cloud Resource Manager API:
- 
          Select your project on the top bar:   
- 
          Then click the "hamburger" icon on the top left and access APIs & Services:   
- 
        Now let's enable several APIs by clicking ENABLE APIS AND SERVICES button:   
- 
        In the search bar, search for bigquery api and then locate and select BigQuery API:
- 
        If BigQuery API is not enabled, enable it:   
- 
        Then repeat the step and enable Cloud Resource Manager API as well:   
- Done! Let's proceed to the next step and create a service account.
Step-3: Create Service Account
Use the steps below to create a Service Account in Google Cloud:
- 
        First of all, go to IAM & Admin in Google Cloud console:   
-         
        Once you do that, click Service Accounts on the left side and click CREATE SERVICE ACCOUNT button:   
- 
        Then name your service account and click CREATE AND CONTINUE button:   
- 
        Continue by clicking the Select a role dropdown to grant the service account the BigQuery Admin and Project Viewer roles:
- 
        Find BigQuery group on the left and then click on BigQuery Admin role on the right:   
- 
        Then click ADD ANOTHER ROLE button, find Project group and select Viewer role:   
- 
        Finish adding roles by clicking CONTINUE button. You can always add or modify permissions later in IAM & Admin.
- 
        Finally, in the last step, click the DONE button:
- 
        Done! We are ready to add a Key to this service account in the next step. 
Step-4: Add Key to Service Account
We are ready to add a Key (JSON or P12 key file) to the created Service Account:
- 
        In Service Accounts open newly created service account:   
- 
        Next, copy email address of your service account for the later step:   
- 
        Continue by selecting KEYS tab, then press ADD KEY dropdown, and click Create new key menu item:   
- 
        Finally, select JSON (Engine v19+) or P12 option and hit CREATE button:   
- The key file is downloaded to your machine. We have all the data needed for authentication, let's proceed to the last step!
Step-5: Configure connection
- 
        Now go to the SSIS package or ODBC data source and configure these fields in the Service Account authentication configuration:
        - In the Service Account Email field, paste the service account email address you copied in the previous step.
        - In the Service Account Private Key Path (i.e. *.json OR *.p12) field, enter the file path of the downloaded key file.
- Done! Now you are ready to use Google BigQuery Connector!
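Before entering these values, it can help to confirm that the downloaded JSON key file is intact. The following is a minimal sketch, assuming the key uses the standard Google service account JSON layout with `type`, `client_email`, `private_key`, and `project_id` fields; the function name `read_service_account_key` is hypothetical and is not part of the connector:

```python
import json

# Fields assumed to be present in a Google service account JSON key file.
REQUIRED_FIELDS = ("type", "client_email", "private_key", "project_id")


def read_service_account_key(path):
    """Return (client_email, project_id) from a service account JSON key file.

    Raises ValueError if the file does not look like a service account key.
    """
    with open(path, "r", encoding="utf-8") as handle:
        key = json.load(handle)
    missing = [name for name in REQUIRED_FIELDS if name not in key]
    if missing:
        raise ValueError("key file is missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("not a service account key: type=" + repr(key["type"]))
    return key["client_email"], key["project_id"]
```

The returned email is the value expected in the Service Account Email field, and the path passed to the function is the value for the Service Account Private Key Path field. A P12 key is a binary file and would need a different check.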
API Connection Manager configuration
Just perform these simple steps to finish authentication configuration:
- 
                            Set Authentication Type to Service Account (Using *.json OR *.p12 key file) [OAuth]
- Optional step. Modify API Base URL if needed (in most cases default will work).
- Fill in all the required parameters and set optional parameters if needed.
- Press Generate Token button to generate the tokens.
- Finally, hit OK button:
 Settings shown in the configuration window:
 Data source name: GoogleBigqueryDSN
 Connector: Google BigQuery
 Authentication Type: Service Account (Using *.json OR *.p12 key file) [OAuth]
 API Base URL: https://www.googleapis.com/bigquery/v2
 Required Parameters:
 - Service Account Email
 - Service Account Private Key Path (i.e. *.json OR *.p12)
 - ProjectId
 - DatasetId (Choose after ProjectId)
 Optional Parameters:
 - Scope: https://www.googleapis.com/auth/bigquery https://www.googleapis.com/auth/bigquery.insertdata https://www.googleapis.com/auth/cloud-platform https://www.googleapis.com/auth/cloud-platform.read-only https://www.googleapis.com/auth/devstorage.full_control https://www.googleapis.com/auth/devstorage.read_only https://www.googleapis.com/auth/devstorage.read_write
 - RetryMode: RetryWhenStatusCodeMatch
 - RetryStatusCodeList: 429
 - RetryCountMax: 5
 - RetryMultiplyWaitTime: True
 - Job Location
 - Impersonate As (Enter Email Id)
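The retry parameters (RetryStatusCodeList, RetryCountMax, RetryMultiplyWaitTime) suggest a policy of retrying on selected HTTP status codes with a growing wait between attempts. The sketch below illustrates that general idea only; it is not the driver's actual implementation, and the function names, the one-second base wait, and the doubling factor are all assumptions:

```python
import time


def parse_status_list(value):
    """Turn a pipe-separated list such as "429|503" into a set of ints."""
    return {int(code) for code in value.split("|") if code.strip()}


def call_with_retry(send, retry_codes, max_retries=5, multiply_wait=True,
                    base_wait=1.0, sleep=time.sleep):
    """Call send() until it returns a status code outside retry_codes.

    send() is any callable returning (status_code, body). After max_retries
    retries the last response is returned even if its status is retryable.
    """
    wait = base_wait
    attempt = 0
    while True:
        status, body = send()
        if status not in retry_codes or attempt >= max_retries:
            return status, body
        sleep(wait)
        attempt += 1
        if multiply_wait:
            wait *= 2  # assumed growth factor; the driver's factor is not documented here
```

With these assumed defaults, a request that keeps returning 429 would be sent six times in total (one initial call plus five retries), waiting 1, 2, 4, 8, and 16 seconds between attempts.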
- 
	Once the data source connection has been configured, it's time to configure the SQL query. Select the Preview tab and then click Query Builder button to configure the SQL query:
- 
	Start by selecting the Table or Endpoint you are interested in and then configure the parameters. This will generate a query that we will use in Azure Data Factory (Pipeline) to retrieve data from Google BigQuery. Hit OK button to use this query in the next step.
	#DirectSQL SELECT * FROM bigquery-public-data.samples.wikipedia LIMIT 1000 /* try your own dataset or Some FREE dataset like nyc-tlc.yellow.trips -- 3 parts ([Project.]Dataset.Table) */
	Some parameters configured in this window will be passed to the Google BigQuery API, e.g. filtering parameters. This means that filtering will be done on the server side (instead of the client side), enabling you to get only the meaningful data much faster.
- 
	Now hit Preview Data button to preview the data using the generated SQL query. If you are satisfied with the result, use this query in Azure Data Factory (Pipeline):
	#DirectSQL SELECT * FROM bigquery-public-data.samples.wikipedia LIMIT 1000 /* try your own dataset or Some FREE dataset like nyc-tlc.yellow.trips -- 3 parts ([Project.]Dataset.Table) */
	You can also access data quickly from the tables dropdown by selecting <Select table>. A WHERE clause or LIMIT keyword will be applied on the client side, meaning that the whole result set will be retrieved from the Google BigQuery API first, and only then will the filtering be applied to the data. If possible, it is recommended to use parameters in Query Builder to filter the data on the server side (in Google BigQuery servers).
- 
    Click OK to finish creating the data source. 
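Once the DSN exists, the same query can also be checked from any ODBC client outside the preview window. This is a minimal sketch in Python, assuming the third-party pyodbc package is installed and the DSN is named GoogleBigqueryDSN as above; the helper names `build_direct_sql` and `preview_rows` are hypothetical:

```python
def build_direct_sql(table, limit=1000):
    """Compose a #DirectSQL query in the same form as the Query Builder example."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return "#DirectSQL SELECT * FROM {} LIMIT {}".format(table, int(limit))


def preview_rows(dsn, query, max_rows=5):
    """Run the query through the ODBC DSN and return the first rows."""
    # Third-party package; imported lazily so build_direct_sql works without it.
    import pyodbc

    connection = pyodbc.connect("DSN=" + dsn, autocommit=True)
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        return cursor.fetchmany(max_rows)
    finally:
        connection.close()


# Example usage (requires the DSN configured in the steps above):
#   sql = build_direct_sql("bigquery-public-data.samples.wikipedia", limit=10)
#   for row in preview_rows("GoogleBigqueryDSN", sql):
#       print(row)
```

Run it with a 64-bit Python if the DSN was created in ODBC Data Sources (x64), since an ODBC client only sees DSNs that match its own bitness.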
Read data in Azure Data Factory (ADF) from ODBC datasource (Google BigQuery)
- 
        To start press New button:   
- 
        Select "Azure, Self-Hosted" option:   
- 
        Select "Self-Hosted" option:   
- 
        Set a name, we will use "OnPremisesRuntime":   
- 
        Download and install Microsoft Integration Runtime. 
- 
        Launch Integration Runtime and copy/paste Authentication Key from Integration Runtime configuration in Azure Portal:   
- 
        After finishing registering the Integration Runtime node, you should see a similar view:   
- 
        Go back to Azure Portal and finish adding new Integration Runtime. You should see it was successfully added:   
- 
    Go to Linked services section and create a new Linked service based on ODBC:   
- 
    Select "ODBC" service:   
- 
    Configure the new ODBC service. Use the same DSN name we used in the previous step (GoogleBigqueryDSN) and enter it in the Connection string box as DSN=GoogleBigqueryDSN:
- 
    Create an ODBC-based dataset for the ODBC service you just created:
- 
    Go to your pipeline and add a Copy data activity to the flow. In the Source section, use the OdbcDataset we created as the source dataset:
- 
    Then go to Sink section and select a destination/sink dataset. In this example we use precreated AzureBlobStorageDataset which saves data into an Azure Blob:   
- 
    Finally, run the pipeline and see data being transferred from OdbcDataset to your destination dataset:   
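For readers who prefer the JSON (code) view in Azure Data Factory, the ODBC linked service and the source side of the Copy data activity configured above correspond roughly to the definitions generated below. This is a sketch, assuming the property names from Microsoft's ODBC connector documentation (`Odbc`, `OdbcSource`, `IntegrationRuntimeReference`) and Anonymous authentication; the linked service name `GoogleBigQueryOdbc` and the helper function names are hypothetical:

```python
import json


def odbc_linked_service(name, dsn, runtime_name):
    """Build an ODBC linked service definition bound to a self-hosted runtime."""
    return {
        "name": name,
        "properties": {
            "type": "Odbc",
            "typeProperties": {
                "connectionString": "DSN=" + dsn,
                # Assumed: the DSN already stores the Google credentials.
                "authenticationType": "Anonymous",
            },
            "connectVia": {
                "referenceName": runtime_name,
                "type": "IntegrationRuntimeReference",
            },
        },
    }


def odbc_copy_source(query):
    """Build the source section of a Copy data activity that runs a query."""
    return {"type": "OdbcSource", "query": query}


linked_service = odbc_linked_service(
    "GoogleBigQueryOdbc", "GoogleBigqueryDSN", "OnPremisesRuntime"
)
copy_source = odbc_copy_source(
    "#DirectSQL SELECT * FROM bigquery-public-data.samples.wikipedia LIMIT 1000"
)
print(json.dumps(linked_service, indent=2))
print(json.dumps(copy_source, indent=2))
```

The printed JSON can be compared with what the Azure Data Factory editor shows for the linked service and the Copy data source; treat any difference as a sign that the assumed property names need adjusting for your ADF version.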
Actions supported by Google BigQuery Connector
Learn how to perform common Google BigQuery actions directly in Azure Data Factory (Pipeline) with these how-to guides:
- [Dynamic Endpoint]
- Create Dataset
- Delete Dataset
- Delete Table
- Get Query Schema (From SQL)
- Get Table Schema
- Insert Table Data
- List Datasets
- List Projects
- List Tables
- Post Dynamic Endpoint
- Read Data using SQL Query -OR- Execute Script (i.e. CREATE, SELECT, INSERT, UPDATE, DELETE)
- Read Table Rows
- Make Generic API Request
- Make Generic API Request (Bulk Write)
Conclusion
In this article we showed you how to connect to Google BigQuery in Azure Data Factory (Pipeline) and integrate data without any coding, saving you time and effort.
We encourage you to download Google BigQuery Connector for Azure Data Factory (Pipeline) and see how easy it is to use it for yourself or your team.
If you have any questions, feel free to contact ZappySys support team. You can also open a live chat immediately by clicking on the chat icon below.