What is Quandl’s Data Structure?
Quandl’s data comes in two formats: the first one being Time-series and the second one being Tables.
Have in mind that some datasets can be shown in both formats.
Time-series data refer to data taken over a period of time.
Time-series are commonly displayed using line graphs, where the X-axis represents dates and the Y-axis represents other numeric observations.
The Time-series format only contains numerical data which is indexed by one date field. This means that the data is sorted by dates. This allows users to retrieve the entire time-series or a particular part of it, also called a “slice”.
Most Quandl datasets are stored in the Time-series format as financial data mostly consist of dates and observations which fit the format nicely.
Tables don’t only contain numerical values.
They can contain various unsorted data types such as strings, numbers, dates, etc. They can also be filtered into different fields.
Quandl ID code
Each feed of data has a short ID, also known as a Quandl code.
- Federal Reserve Economic Data (time-series): FRED
- Core US Fundamentals (tables): SF1
Time-series data such as FRED is built-up of a large number of individual time-series where each of them has its own Quandl code appended to the main Quandl code, for example: US civilian unemployment rate: FRED/UNRATE.
On the other hand, Tables data such as SF1 contain one or more tables where each of them has its own Quandl code, for example: The Quandl code for the Core US Fundamentals table is SHARADAR/SF1.
When it comes to Data Analysis tools, Quandl directly supports API, R, Python, Excel, Ruby and more.
The full list can be found here: https://www.quandl.com/tools/full-list
What are the Pros and Cons of Quandl?
What are the differences between Free and Premium datasets?
As the devil is in the details, their free products come from various “credible and established” sources such as central banks, government agencies, organizations, etc.
On the other hand, their premium products come from data providers such as: AlgoSeek, Applied Research, Barchart, Brave New Coin, and CryptoCompare etc.
The full list can be seen here: https://www.quandl.com/publishers
Quandl premium datasets have a regular update time while there are no guaranteed update times for free datasets and they can usually take anywhere from a few hours to weeks, months or years (depending on the data itself).
Quandl’s premium data is curated and maintained by professional providers with decades of expertise under their belts. This all adds on the betterment of documentation, correctness, organization, uniqueness and structure of the data. Here we can see the supremacy of the premium data.
Premium data on Quandl can be purchased via subscriptions. With them you subscribe only to particular datasets that you need. This means that you only pay for what you need. Have in mind that there is no fee to use the platform itself or Quandl’s open data feeds.
Almost all premium datasets come with some free sample data. To see a free sample data, you need to log in to your Quandl account. After clicking on any premium data feed of your interest, the website will take you to the home page of that feed where you will be presented with some sample data.
It needs to be mentioned that every single free data feed has a transparent link that leads to its source. This means that we can verify the data and the source it comes from.
What are the types of Free Data in Quandl?
When it comes to free data, Quandl offers you a massive range from Prices & Volume, Fundamentals, Sentiment, Derived Metrics, National Statistics, and Other.
Let’s mention some of the free datasets:
- Wiki Continuous Futures – This dataset is built on top of raw data from ICE, CME, LIFFE, etc, and is curated by the Quandl community.
- Zillow Real Estate Data – This dataset is the leading real estate and rental marketplace.
- Federal Reserve Economic Data – This dataset includes things such as growth, employment, inflation, labor, manufacturing and many more US economic data.
How to create an account in Quandl?
Having a Quandl account is important as it allows you to:
- Access their API, libraries and tools.
- Download free and/or premium data in any format.
- Access their export and visualization tools.
- View relevant information about pricing and licensing for their premium data.
Let’s get you started with your Quandl account. Have in mind that it is free and doesn’t require any credit card information.
Let’s go over the steps:
- Go over to the Quandl website (https://www.quandl.com/) and click on the sign up button in the top right corner.
- A screen will open that requires your first and last name, as well as your purpose of using Quandl (business, academic or personal). So feel free to input those fields. After that, click on “Next”.
- Now, another screen will appear where you will input your email address and answer a question on “How will you be using this data?” After that, click on “Next”.
- The final screen of your signing up process will appear where you will be asked to create a password. After you have created your password and checked the terms of service box, click on “Create Account”.
- Now, a welcome screen will appear with your API key.
Wait Igor, what in the world is this API key?! You didn’t prepare us for that!
Calm down, be sure to copy your key and don’t forget to verify your email address. If you pressed “Continue” without copying the API key, it’s not an issue. You can find your key in the Account Details section.
Now let’s explain what an API key is!
API key stands for the Application Programming Interface key and is represented as a string of code that gets passed in by a computer to an application. The program or application then uses the API key to identify the user to a website.
The API key is a “secret” authentication token as well as a unique identifier.
Think of it as your personal secret “password” for collecting data from Quandl.
Example of Quandl usage in Excel
- Let’s get our Database. Press the “Get Data” button inside of your Quandl toolbar.
- After your API key gets verified (remember that?), we’ll select our Data. For this example, I’ll pick the European Stock Prices for FSE Frankfurt Stock Exchange.
3.Click the blue “Next” button and select your preferred stock. I’ll use the Zooplus (ZO1_X). Click “Next”.
4. After that, a new window will open that will ask you to specify what Columns will your dataset have. I selected the Date, Open, High, Low and Close columns. Click “Next”.
5.Now, we’ll select the filters that we want for our data. As I
don’t want to make the steps too complicated, I won’t apply any filters.
6.After clicking the “Next” button, you are given a Placement option aka where do you want your data to be placed at in your Excel sheet. When you specify your Excel Sheet and Cell, click the “Insert” button.
7.Voilà! Your data is ready! If it didn’t appear for you, don’t panic! Just press the “Refresh Sheet” button that is found in your Quandl toolbar.
For the next example, the steps are the same but the dataset will be different. I’ll show you a Corporate Annual Report data for the Facebook Inc. dataset:
The next example will be different. We’ll use the Quandl “Formulas” option to code what we need and want.
Let’s follow the steps:
1. Let’s suppose that we want to download the daily closing prices for two assets in the last 3 days (as shown in the picture). First we need to specify that we want the closing prices by writing the following string (=’your cell’&”/CLOSE”) in our formula Excel section. Be sure to do this for both assets. You can simply drag the cell with the written code to the other cell.
2. Now, let’s add our Start and End dates. For this I’ll use the “today()” formula in Excel. For the Start date we will write “today()-3” which means past 3 days and for the End date we will just use “today()”.
3.This step requires us to use the “qseries” formula which, in our example, looks like this “=QSERIES(D2:E2,B1:B2)” As you can see, the D2:E2 specifies our asset filter and the B1:B2 part specifies our Start and End date. After the formula is written, refresh the sheet as we did in our past example, and you’re done.
If you want to change stuff up, and look at the data for the past 3 years, in the Start date field write “=today()-365*3”. Be sure to play with Quandl so you get a hang of it!
How to extract Quandl data in Python?
For this part, I’ll be using the Jupyter Notebook from the Anaconda application for Data Science.
More info on how to set up your Jupyter Notebook: Preparing Your Python Coding Environment
Firstly, we need to install the Quandl package for Python. For this, use the command “pip install quandl”.
Let’s show a few examples:
Example 1: WTI Crude Oil prices
For this example, we want to get the WTI Crude Oil price from the United States department of Energy:
1. We need to import all the packages we’re going to use. For this example, I’ll import Quandl, NumPy and Pandas. Note that you need to have NumPy and Pandas installed. They come pre-installed with the Anaconda program and the codes go like this: “pip install NumPy” and “pip install Pandas”.
2. After that, we need to activate our API key. Use the following command: quandl.ApiConfig.api_key = “YOUR_API_KEY”
3. The next step is to use the following command “mydata = quandl.get(“EIA/PET_RWTC_D”)”.
It is of utmost importance to know the Quandl code (“EIA/PET_RWTC_D”) of each dataset you want to use.
You can find the Quandl code for each dataset under the documentation page of each dataset. Example: https://www.quandl.com/data/EIA-U-S-Energy-Information-Administration-Data/documentation
If you want to get the same data, but as a NumPy array, you can use this code: mydata = quandl.get(“EIA/PET_RWTC_D”, returns=”numpy”)
1. In order to set specific Start and End dates, we shall write the following: mydata = quandl.get(“FRED/GDP”, start_date=”2019-12-31″, end_date=”2020-9-26″)
2. This is where Pandas come into play, as we want to save our extracted data to a .csv aka Excel format.
We will convert our data into a Pandas DataFrame using the following command: “df = pd.DataFrame(mydata, columns= [‘Date’,’Value’])”
3. After that, we will export our data and give the .csv file a name: “df.to_csv (‘Name.csv’)”
Example 3: Facebook Annual Report Data
Let’s download a Corporate Annual Report for Facebook. The steps are the same as in our first Python example, but with a little twist. Our Quandl code for this one is “WIKI/FB”.
We specify the years we want to go through and get the data we’ll need to preprocess it. This means that we need to extract only the annual part of it.
For this, we shall use the following command: mydata = quandl.get(“WIKI/FB”, collapse=”annual”)
Now our data looks like this:
Now you are all equipped to go on a Quandl coding journey in Python!
“May the code be with you!” – Obi Wan Kenobi
Quandl in other programming languages
Quandl is also equipped for other programming languages like R, MATLAB, Ruby and others.
In this article, we won’t go into them as it would make it too long. But R, Ruby and MATLAB enthusiasts worry not, as I’ll provide you with useful links where you can learn how to use Quandl in your programs!
MATLAB – https://github.com/quandl/Matlab
What are the limitations for Quandl downloads?
Quandl needs to limit their users when it comes to downloading so the API doesn’t get over flooded and the whole service crashing down.
Anonymous users are limited to 20 calls per 10 minutes and 50 calls per day; while authenticated users have a limit of 300 calls per 10 seconds, 2k calls per 10 minutes and 50k calls per day.
When it comes to the free authenticated users, they have a concurrency limit of one which means that they can only make one call at a time and have an additional call waiting in the queue.
As for the premium users, they have a limit of 5k calls per 10 minutes and a limit of 720k calls per day. They are also limited to 10 bulk downloads per hour.
What are some Quandl alternatives?
When it comes to price and corporate fundamentals, there are the following: Yahoo finance, IEX, Alpha Vantage, Polygon, TradingView…
More info: Google Finance API and 9 Alternatives
When it comes to Alternative data, there are the following: M Science, Advan, Thinknum, YipitData, Tegus, Arch Metrics and many more.
Here is the full list – https://alternativedata.org/data-providers/
More on Quandl’s background
According to Quandl, they have over 400,000 users. These ranges from the world’s top hedge funds to investment banks and various asset managers.
Quandl’s specialization is in the field of data that gets collected over time and is categorized as economic, social and financial data – this includes things like: stock prices, crime rates, unemployment rates, various populations, etc.
They manage to do this by indexing millions of numerical datasets from around the internet. For example, if you click on a specific dataset that is indexed in Quandl, it will go to its original dataset source, extract the newest version of it, clean it, and give it back to you in whatever form you want.
Quandl is striving for the democratization of data and making it easier to find and use for all its users. Their goal is to provide their service to various types of users and professions.
Can you believe that they even offer data pertained to some basic sport statistics?!
Ok Igor, that all sounds sweet, but how does Quandl acquire data?
Well, Quandl acquires its data by having various human curators that point their bots at specific datasets. Recently, they’ve been looking for more curators that are similar to Wikipedia’s volunteer editors.
You can also become a vendor and sell your data to Quandl. For this, the technical requirements aren’t strict but you need to create, document and maintain the data yourself with a “professional standard of quality”.
The good thing is that the listing of your data to Quandl doesn’t come with any fees. You also get support from the Quandl team for free and your data is viewed as your intellectual property.
As a Quandl vendor, you have full transparency of who’s buying/accessing your data. You are also provided with a vendor dashboard that shows you all the sales, subscribers, revenues and other information in real-time.