Ten ways to create ‘pandas dataframes’

Jayanti prasad Ph.D
Oct 1, 2019

Python pandas data frames are among the favourite data objects of data scientists and data practitioners. I planned to write a long article about the 100 things one can do with pandas data frames, but found that overambitious and time-consuming, so I will break it into multiple parts. This is the first part of the series and focuses on how to create pandas data frames.

What are pandas data frames?

Pandas data frames are rectangular objects (much like matrices) with a number of rows and columns. All the entries of a column share one data type (dtype). In general, data frame elements can be integers, floating point numbers, strings or even dictionaries (these need special care when reading and writing). The most important property of a data frame is its ‘shape’, which gives the number of rows and columns the data frame has.
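As a small illustration of the shape and per-column type described above, here is a minimal sketch. The column names and values are made up for the example; only `shape` and `dtypes` are the pandas attributes being shown.

```python
import pandas as pd

# Illustrative data: one string column and one integer column.
df = pd.DataFrame({"item": ["Apple", "Mango", "Orange"], "number": [3, 5, 11]})

# shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print(n_rows, n_cols)  # 3 2

# dtypes lists one data type per column.
print(df.dtypes)
```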

How to get python pandas data frames?

Install pandas in whichever way you prefer (pip install pandas is usually the easiest). Once pandas is installed, import it with:

import pandas as pd

Now let us discuss how we can create pandas data frames.

1. Empty pandas frame

>>> import pandas as pd
>>> df=pd.DataFrame()
>>> type(df)
<class 'pandas.core.frame.DataFrame'>
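An empty frame has no rows and no columns. A quick way to confirm that, sketched here with the standard `empty` and `shape` attributes:

```python
import pandas as pd

df = pd.DataFrame()

print(df.empty)  # True
print(df.shape)  # (0, 0)
```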

2. Data frame with a list

>>> import pandas as pd
>>> df=pd.DataFrame([1,2,3])
>>> df
   0
0  1
1  2
2  3
>>> df.shape
(3, 1)
>>> df=pd.DataFrame([[1,2,3],[4,5,6]])
>>> df.shape
(2, 3)
>>> df
   0  1  2
0  1  2  3
1  4  5  6

3. Data frame from a dictionary

>>> df=pd.DataFrame({'Name':['intel','apple','microsoft','nvidia'], 'Product': ['hardware','hardware','software','hardware']})
>>> df
        Name   Product
0      intel  hardware
1      apple  hardware
2  microsoft  software
3     nvidia  hardware

4. Using loc

>>> df=pd.DataFrame(columns=['item','numbers'])
>>> df.loc[0]=['Apple',5]
>>> df.loc[1]=['Orange',7]
>>> df.loc[2]=['Mango',9]
>>> df
     item numbers
0   Apple       5
1  Orange       7
2   Mango       9
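Assigning rows one at a time with loc works well for a handful of rows, but each assignment enlarges the frame, which can get slow for long loops. A common alternative, sketched here as a suggestion rather than as part of the original recipe, is to collect the rows in a plain Python list and build the frame once at the end. The `rows` list and the loop variables are names chosen for this example.

```python
import pandas as pd

# Collect each row as a dictionary in an ordinary list.
rows = []
for item, count in [("Apple", 5), ("Orange", 7), ("Mango", 9)]:
    rows.append({"item": item, "numbers": count})

# Build the data frame in a single step.
df = pd.DataFrame(rows, columns=["item", "numbers"])
print(df)
```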

5. From a two dimensional list

>>> data = [['Apple',3], ['Mango',5], ['Orange',11]]
>>> df = pd.DataFrame(data, columns=['item','number'])
>>> df
     item  number
0   Apple       3
1   Mango       5
2  Orange      11

6. From Pandas Series

>>> p = pd.Series(['Apple','Mango','Orange'])
>>> q = pd.Series([10,20,30])
>>> df = pd.DataFrame({'item': p, 'number': q})
>>> df
     item  number
0   Apple      10
1   Mango      20
2  Orange      30
>>> p = pd.Series(['Apple','Mango','Orange'], index=['a','b','c'])
>>> q = pd.Series([10,20,30], index=['a','b','c'])
>>> df = pd.DataFrame({'item': p, 'number': q})
>>> df
     item  number
a   Apple      10
b   Mango      20
c  Orange      30

7. From dictionary of dictionaries

>>> data = {'item': {'a': 'apple','b': 'mango', 'c': 'orange'}, 'number': {'a': 10, 'b': 20, 'c': 30}}
>>> data
{'item': {'a': 'apple', 'b': 'mango', 'c': 'orange'}, 'number': {'a': 10, 'b': 20, 'c': 30}}
>>> df=pd.DataFrame(data)
>>> df
     item  number
a   apple      10
b   mango      20
c  orange      30
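With a dictionary of dictionaries, the outer keys become the columns and the inner keys become the row labels. When the inner dictionaries do not share the same keys, pandas fills the gaps with NaN. The sketch below uses a deliberately incomplete variant of the data above to show this.

```python
import pandas as pd

# 'item' has no entry for 'c' and 'number' has no entry for 'b'.
data = {
    "item": {"a": "apple", "b": "mango"},
    "number": {"a": 10, "c": 30},
}
df = pd.DataFrame(data)

# The row labels are the union of the inner keys; missing cells are NaN.
print(df)
print(df.isna().sum())  # count of missing values per column
```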

8. From a numpy array

>>> import numpy as np
>>> data = np.array([[1,1,1],[2,4,8]])
>>> df=pd.DataFrame(data, columns=['same','square','cube'])
>>> df
   same  square  cube
0     1       1     1
1     2       4     8
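The constructor also accepts an `index` argument, so the rows of a numpy array can be given labels in the same way as the columns. A minimal sketch, with the row labels "one" and "two" chosen only for illustration:

```python
import numpy as np
import pandas as pd

data = np.array([[1, 1, 1], [2, 4, 8]])

# Label both the rows and the columns of the array.
df = pd.DataFrame(data, index=["one", "two"], columns=["same", "square", "cube"])

print(df.loc["two", "cube"])  # 8
```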

9. Using the ‘from_dict’ function

>>> data={'name': ['apple', 'mango','orange'], 'number': [2,6,7]}
>>> df=pd.DataFrame.from_dict(data)
>>> df
     name  number
0   apple       2
1   mango       6
2  orange       7
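Called this way, from_dict gives the same result as passing the dictionary to pd.DataFrame directly. Where it differs is the `orient` argument: with `orient="index"` the dictionary keys become row labels instead of column names. The sketch below assumes made-up values and a hypothetical "price" column purely for illustration.

```python
import pandas as pd

# Each key is a row label; each list holds that row's values (illustrative data).
data = {"apple": [2, 0.5], "mango": [6, 1.2], "orange": [7, 0.8]}

# orient="index" turns keys into rows; columns names the values in each list.
df = pd.DataFrame.from_dict(data, orient="index", columns=["number", "price"])
print(df)
```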

10. From a csv file

Let us assume we have a ‘csv’ file with the following content:

name, age
ram, 10
shyam, 20

We can read it into a pandas data frame:

df = pd.read_csv("name.csv")
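One detail worth knowing about the file above: it has a space after each comma, so a plain read_csv call names the second column ' age' (with a leading space). The sketch below reads the same content from an in-memory string, which stands in for name.csv so the example runs without a file, and passes `skipinitialspace=True` to drop those spaces.

```python
import io

import pandas as pd

# Same content as the csv file, held in a string for a self-contained example.
csv_text = "name, age\nram, 10\nshyam, 20\n"

# skipinitialspace=True ignores the spaces that follow each comma.
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(list(df.columns))  # ['name', 'age']
print(df)
```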

If you find the article useful and interesting, please like and share it, and post below if you have any comments. I will add the next part on pandas data frames soon.
