In this blog you will learn how to automatically fetch Transaction history for your Paypal account daily and insert it into a database using Python and Azure.
APIs (Application Programming Interfaces) play a big role in today’s world. It allows us to share data and practical business functionalities between devices, applications, and individuals. Although we may not notice them, APIs really are everywhere, powering our lives from behind the scenes. Most common example would be travel bookings. Travel apps uses third party APIs to integrate data from providers and compare those to give you the best options available. Likewise, Paypal has its own set of APIs available for its customers and business owners to integrate it in different ways into their businesses. In this tutorial we will be using the Transaction API provided by paypal to fetch history of transaction into a SQL database. Analysing these historical transactions can be very useful for business in today’s world.
You will need to go through some basic steps in order to get started. In a nutshell do these things before you proceed
- Create a PayPal account.
- Create a REST API app.
- Get your API credentials.
- Create sandbox accounts for testing.
A detailed guide to these steps can be found here
Few steps to add if you just created sandbox account:
- Create a few personal accounts and business accounts at step 4 mentioned above. (Login into paypal using these credentials at sandbox url and not the regular one. There are two different links to login- first is to login into your developer account and second is for logging into sandbox test account and making mock test transactions between different accounts)
- Then make a few transactions from these accounts so that you get some information while calling the transaction endpoints.
- Then go to this link and follow the steps to ensure that your app has the permission to access the transactions of your business account.
By now you have got your API credentials- your client ID and secret.
Next step is to exchange your API credentials for an access token. Your access token authorizes you to use the PayPal REST API server. To call a REST API in your integration, exchange your client ID and secret for an access token.
To check whether you have got the above steps right you can make a POST call to the API to get an access token. You can do this in two ways- 1) By using curl command (easy and quick) 2) Postman app. This link will explain you the steps to do so. A single access token will be valid for about 9 hours.
Now let us test if we are getting a json response when we call the Transaction API using this access token.
We will use postman for this.
- The URL to the API service is either:
Sandbox. https://api-m.sandbox.paypal.com
Live. https://api-m.paypal.com
- Fill in rest of the details in query parameters as shown. There are a number of optional parameters and headers which you may require for other APIs but not for this one. You can find those details here.
- Select Authorization as Bearer Token and put the access token from the previous step in this field. Go to headers tab and put in key as Content-Type and value as application/json. And in body tab select x-www-form-urlencoded to encode your request.
- Hit on Send and you should get a json response in the response tab below.
Details of the response and tables needed to store:
- paypal_count – This table stores the number of entries that were fetched in a single call. It has 3 columns- start date, end date and count.
- paypal_transaction_info – This table stores the transaction details. The fields in this table can vary depending upon the fields you require. The list of transaction fields that an API call returns is listed here. Some of the fields are nested and contain array of objects. Look for the structure of the object in the documentation and then parse the response. For example, all the amount fields in this response is a money object containing currency code and value inside it. And certain fields return an array of objects. So, one has to create separate tables to store these objects in the array to preserve the 1st Normal from of the relational database. Look in the documentation if the object is an array or not.
Example:
- paypal_cart – This table stores the item detail associated with each transaction. This is a nested array object inside the response. The structure goes like this – transaction_details(array) -> transaction_detail(object) -> cart_info(object) -> item_details(array)-> item_detail(object)
Look for the structure of the object in the documentation before coding.
You can see the sample response here on the right side of the page.
Now that we have verified that our calls are responding and understood the response structure, we now go on to do this process using Python.
The code and queries are available in this repository.
Algorithm part 1:
- Create a database on your SQL server. (Use SQL Server Management Studio (SSMS) for better management.)
- Create 3 tables- paypal_daily_count, paypal_cart and paypal_transaction_info. (Create queries are available on this link)
- Specify start date and end date in the request url (https://api.paypal.com/v1/reporting/transactions. Endpoints differ for sandbox accounts. You also have to include bearer token in the headers if you have it)
- Specify other parameters in the url (separated by ‘&’) depending upon your requirement. You can see the list of parameters that you can pass over here)
- Now hit this url using requests library in python.
- If it returns an invalid_token error then request for token. Send a POST request using your Client ID as username and Secret key as password (that you got after creating the app on paypal) on this endpoint- https://api.paypal.com/v1/oauth2/token?grant_type=client_credentials (Your credentials should be a Base64 encoded string appended by the word Basic in the header of the request)
Tip- You can directly get Python code of the above steps (3-6) using Postman.
- Get the token and repeat step 3-5.
- The response will be in Json format. Parse this response using the predefined keys available in the documentation and insert it into the database.
The above-mentioned steps show how to fetch data into your database using python.
Now we will see how we can automate this process using these steps (Steps 1-4 is similar as above):
- Request a bearer token.
- Get the greatest transaction_updated_date from the database and use it as a start_date.
- Assign “span” variable a number greater than 0 and less than or equal to 31 to calculate the end_date until when you require the transaction history.
- Send the request to the endpoint with parameters in the url.
- Parse the response then-
- Insert start_date, end_date and count in the paypal_count table.
- Parse the fields that you need from the response and insert the transaction details in the transaction table. (transaction id and transaction_updated_date are together the primary key). There are some nested arrays in each element of transaction_detail. See if you need those fields. In this tutorial we will be parsing array of objects in cart_info from each element of transaction_detail object.
- Parse item_details array under transaction_details.
- Look for the link to next page in the header section and store the total pages in the response in a variable.
- Loop through the pages in the link while the current page is not equal to the total_page variable.
- Parse the response for each page and repeat step 5.
The entire code is available here.
Now that we have completed the coding part, we will deploy our code in Azure functions and generate a trigger to automatically fetch records from paypal on a daily basis or weekly basis in a separate blog.
Important tips/notes for Paypal API:
- A transaction ID is not unique in the reporting system. The response can list two transactions with the same ID. One transaction can be balance affecting while the other is non-balance affecting.
- The transaction_updated_date along with transaction ID can form a unique primary key for each table.
- Use fields=all parameter while querying to get additional hidden fields in the response while calling transaction API
- It takes a maximum of three hours for executed transactions to appear in the list transactions call. So, set up your trigger accordingly.
- The maximum page size is 500 and maximum pages are 20 which gives us a maximum of 10,000 records in one call. If the account has more than 10,000 records for a specified date range, shorten the date range else you get an error.
- The Paypal Transaction API always returns date and time according to UTC. Make suitable conversions in your database or in the python code using the library timedelta and datetime.
Mahadevan Iyer
Data Engineer
Addend Analytics.