Pandas and NumPy

NumPy and pandas are Python libraries that provide their own data structures, much as the language itself provides dictionaries and lists.

NumPy stands for ‘Numerical Python’. It is an open-source Python library that enables fast, high-performance mathematical computation on arrays and matrices. To use NumPy, you first need to import it.

>>> import numpy as np

NumPy is well suited to working with multi-dimensional arrays. In NumPy, dimensions are called axes, and the number of axes is called the rank. Below is a NumPy array called data and the operations that can be performed.
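As a minimal sketch of such an array, the snippet below builds a small array called data and runs a few common operations on it. The values are made up for illustration and are not the ones from the original example.

```python
import numpy as np

# Illustrative values only; the original example's data is not shown in the text.
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

rank = data.ndim             # number of axes (the rank): 2
shape = data.shape           # length of each axis: (2, 3)
total = data.sum()           # sum of all elements: 21
col_sums = data.sum(axis=0)  # sum down each column: [5, 7, 9]
doubled = data * 2           # element-wise arithmetic, no explicit loop
```

Operations such as data * 2 apply to every element at once, which is a large part of why array code tends to be shorter than the equivalent loop over a list.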

NumPy arrays are generally faster than Python lists for numerical work and use less memory. The library's key data structure is the array.

We can also reshape the array, for example into a single column of 6 rows by 1 column.
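A short sketch of reshaping, assuming a 6-element array like the hypothetical one below. Note that a shape of 6 rows and 1 column still has two axes; a truly one-dimensional result has the shape (6,).

```python
import numpy as np

# A hypothetical 2 x 3 array used only for illustration.
data = np.array([[1, 2, 3],
                 [4, 5, 6]])

column = data.reshape(6, 1)  # 6 rows, 1 column (still two axes)
flat = data.reshape(6)       # a single axis holding all 6 elements
```

Reshaping only works when the total number of elements stays the same, here 2 * 3 = 6 * 1 = 6.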

Pandas

Besides the animal, pandas in Python is the Python Data Analysis Library. The name is derived from "panel data" and is also a play on "Python data analysis".

Pandas is used to manipulate tabular data. Its main data structure is the DataFrame.

Data frames are similar to SQL tables or the spreadsheets that you work with in Excel. They are often faster and easier to use, and they live in the Python environment. To use pandas in Python, we first need to import it:

import pandas as pd

A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular way in rows and columns. It allows you to load your dataset from, say, a CSV file, an SQL database, an Excel file, or even an array. For example, below is an array of integers with 5 columns and 20 rows.

Then below is the same data turned into a table.
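The step from array to table can be sketched as follows. The integers and the column names a to e are stand-ins chosen for illustration, since the original data is not reproduced in the text.

```python
import numpy as np
import pandas as pd

# Stand-in data: the integers 0..99 arranged as 20 rows and 5 columns.
array = np.arange(100).reshape(20, 5)

# Column names are hypothetical; any five labels would work.
table = pd.DataFrame(array, columns=["a", "b", "c", "d", "e"])

print(table.head())  # show the first five rows of the table
```

If no column names are given, pandas labels the columns 0 to 4 by default.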

Data frames can be created from:

  • Python dictionaries
  • Python lists
  • Two-dimensional NumPy arrays
  • Files
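Each of the sources listed above can be sketched in a line or two. The names and values are invented for illustration, and an in-memory CSV string stands in for a real file so that the example runs on its own.

```python
import io

import numpy as np
import pandas as pd

# From a dictionary: the keys become column labels.
from_dict = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 25]})

# From a list of rows, with the column labels supplied separately.
from_list = pd.DataFrame([["Ann", 31], ["Bob", 25]], columns=["name", "age"])

# From a two-dimensional NumPy array.
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]), columns=["x", "y"])

# From a file: this in-memory CSV stands in for a real path on disk.
csv_text = io.StringIO("name,age\nAnn,31\nBob,25\n")
from_file = pd.read_csv(csv_text)
```

With a real file, pd.read_csv would take the file path instead of the StringIO object.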

Each column of a pandas DataFrame is an instance of pandas.Series, and each has a label (the column header) that you can use to access it.
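A small sketch of accessing a column by its label, using a made-up DataFrame:

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 25]})

ages = df["age"]                         # select a column by its label
is_series = isinstance(ages, pd.Series)  # the column is a pandas.Series
oldest = ages.max()                      # Series methods work on the column
```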

With data frames, you can:

  • Retrieve and modify row and column labels as sequences
  • Represent data as NumPy arrays
  • Check and adjust the data types
  • Analyze the size of DataFrame objects
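The operations listed above might look like this in practice. This is only a sketch on a made-up DataFrame; the labels r1, r2, person and years are hypothetical.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 25]})

# Retrieve and modify row and column labels as sequences.
df.index = ["r1", "r2"]
df.columns = ["person", "years"]
row_labels = list(df.index)
col_labels = list(df.columns)

# Represent the data as a NumPy array.
values = df.to_numpy()

# Check and adjust the data types.
dtypes = df.dtypes
df["years"] = df["years"].astype("float64")

# Analyze the size of the DataFrame.
shape = df.shape    # (rows, columns)
n_values = df.size  # total number of cells
```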
