PYTHON Pandas Re-Indexing

Original article was published by Furkan Gulsen on Artificial Intelligence on Medium


Reindexing changes the row labels and column labels of a DataFrame. To reindex means to conform the data to a given set of labels along a particular axis. Several operations can be performed through reindexing:

  • Reorder existing data to match a new set of labels.
  • Insert missing value (NA) markers at label positions where no data exists for that label.
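
The two operations above can be sketched as follows. This is a minimal illustration rather than the article's original example: the column names, the size N, and the seed are assumptions chosen for the demo. Rows are selected and reordered by label, and the requested column "B", which does not exist in the source, comes back filled with NaN.

```python
import numpy as np
import pandas as pd

N = 6  # assumed size, small enough to print
np.random.seed(0)  # arbitrary seed, only for repeatability

df = pd.DataFrame({
    "A": pd.date_range(start="2016-01-01", periods=N, freq="D"),
    "x": np.linspace(0, N - 1, num=N),
    "C": np.random.choice(["Low", "Medium", "High"], N).tolist(),
})

# Keep only rows 0, 2 and 5, and ask for columns A, C and B.
# "B" is not in df, so reindex() creates it and marks every value as missing.
df_reindexed = df.reindex(index=[0, 2, 5], columns=["A", "C", "B"])
print(df_reindexed)
```

Note that reindex() returns a new DataFrame and leaves df unchanged.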

Note: The example for this section relies on a few NumPy features, briefly explained here.

  • tolist(): Returns a copy of the array data as a (nested) Python list. Data items are converted to the nearest compatible built-in Python type via the item() function.
  • linspace(): Returns evenly spaced numbers over a specified interval.
  • choice(): Generates a random sample from a given 1-D sequence.
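
The three functions can be tried in isolation with a short sketch like the one below. The values and the seed are assumptions picked only to keep the output small.

```python
import numpy as np

# linspace: 5 evenly spaced values from 0 to 1, both endpoints included.
grid = np.linspace(0, 1, num=5)

# choice: draw 4 items (with replacement by default) from a 1-D sequence.
np.random.seed(42)  # arbitrary seed, only for repeatability
sample = np.random.choice(["Low", "Medium", "High"], 4)

# tolist: turn the arrays into plain Python lists of built-in types.
grid_list = grid.tolist()
sample_list = sample.tolist()

print(grid_list)    # list of Python floats
print(sample_list)  # list of Python strings
```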

Reindex to Align with Other Objects

You may want to take an object and reindex its axes so that they carry the same labels as another object.
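
One way to do this in pandas is reindex_like(). The sketch below uses two small made-up frames (df1 and df2 are illustrative names, not taken from the article): df1 is cut down to the rows and columns that df2 has.

```python
import numpy as np
import pandas as pd

# df1 has 10 rows and 3 columns; df2 has 7 rows and 2 columns.
df1 = pd.DataFrame(np.arange(30).reshape(10, 3), columns=["col1", "col2", "col3"])
df2 = pd.DataFrame(np.arange(14).reshape(7, 2), columns=["col1", "col2"])

# Give df1 the same row and column labels as df2.
# The values still come from df1; only the labels are taken from df2.
aligned = df1.reindex_like(df2)
print(aligned)
```

Labels present in df2 but missing from df1 would be filled with NaN, just as with reindex().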

Filling During Re-Indexing

reindex() takes an optional method parameter that selects how missing values are filled. It accepts the following values:

  • pad/ffill: Fill values forward.
  • bfill/backfill: Fill values backward.
  • nearest: Fill from the nearest index values.
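
The fill methods can be compared side by side on a small Series. This is a sketch with made-up data; the index must be monotonic for method to work.

```python
import pandas as pd

# Values exist only at labels 0, 3 and 6.
s = pd.Series([10.0, 20.0, 30.0], index=[0, 3, 6])
new_index = list(range(8))  # labels 0..7

plain = s.reindex(new_index)                      # gaps stay NaN
forward = s.reindex(new_index, method="ffill")    # carry the last seen value forward
backward = s.reindex(new_index, method="bfill")   # pull the next value backward
nearest = s.reindex(new_index, method="nearest")  # take the closest label's value

print(pd.DataFrame({"plain": plain, "ffill": forward,
                    "bfill": backward, "nearest": nearest}))
```

With bfill, label 7 stays NaN because there is no later value to pull from.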

Filling Limits During Re-Indexing

The limit argument provides additional control over filling while reindexing. It specifies the maximum number of consecutive missing labels that may be filled.
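
A minimal sketch of limit, using made-up data: with a forward fill and limit=1, only the first missing label after each known value is filled, and the rest of the gap stays NaN.

```python
import pandas as pd

# Values exist only at labels 0 and 4.
s = pd.Series([10.0, 20.0], index=[0, 4])
new_index = list(range(8))  # labels 0..7

unlimited = s.reindex(new_index, method="ffill")
limited = s.reindex(new_index, method="ffill", limit=1)

print(pd.DataFrame({"unlimited": unlimited, "limit=1": limited}))
```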

Renaming

The rename() method allows you to relabel an axis based on a mapping (a dict or Series) or an arbitrary function.
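
Both styles are sketched below on a small made-up frame (the labels are illustrative only). Labels that are absent from the mapping are left as they are.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]},
                  index=["apple", "banana", "durian"])

# Relabel with dicts: "durian" is not in the index mapping, so it is kept.
renamed = df.rename(columns={"col1": "c1", "col2": "c2"},
                    index={"apple": "a", "banana": "b"})

# Relabel with a function applied to every column label.
upper = df.rename(columns=str.upper)

print(renamed)
print(upper)
```

rename() returns a new DataFrame, so df keeps its original labels.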