Python Tutorial: Concatenation

DataCamp

4 years ago

622 views

Want to learn more? Take the full course at https://learn.datacamp.com/courses/pandas-joins-for-spreadsheet-users at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.

---
In this video you'll learn how to join data with concatenation.

You may recognize the term concatenate from the 'concatenate' formula used in spreadsheets to combine text cells.
The concept in pandas is the same, and when it's applied to data frames, it's like copying and pasting in a spreadsheet, or like gluing two pieces of wood together face-to-face.

In pandas we use the concat function (pd.concat) to concatenate data frames. The function allows us to join two or more data frames along either rows or columns.

Concatenating along rows is very useful when working with split data. For instance, reports are often produced annually with each year saved to a separate tab or file.

You can vertically stack two or more data frames like these using pandas concat. The function returns a single data frame whose rows appear in the order in which you pass the data frames, such as df1, df2, and so on.
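As a minimal sketch of stacking along rows, the example below builds two small yearly reports and passes them to pd.concat. The frame names, column names, and values are hypothetical and only illustrate the idea; 'Game Key' is used as the index name to match the example in the video.

```python
import pandas as pd

# Hypothetical yearly reports; names and values are made up for illustration.
report_2016 = pd.DataFrame(
    {"Team": ["A", "B"], "Score": [21, 14]},
    index=pd.Index([1, 2], name="Game Key"),
)
report_2017 = pd.DataFrame(
    {"Team": ["C", "D"], "Score": [7, 28]},
    index=pd.Index([3, 4], name="Game Key"),
)

# Rows are stacked by default (axis=0); the output follows the list order.
stacked = pd.concat([report_2016, report_2017])
print(stacked)
```

Because the two inputs use different 'Game Key' values, the index of the result stays unique.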

Basic concatenation works best when the input data frames have distinct index values, such as 'Game Key' in this example. That way the resulting frame will still have an index of unique values.
Also, it’s worth noting that the concat function keeps every row from every input. By default it also keeps every column that appears in any input and fills the gaps with missing values. In other words, it performs an 'outer join' on the columns.
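To see what the default outer join does in practice, here is a small sketch with two inputs that do not share all of their columns. The frames and column names are hypothetical; join="inner" is shown only for contrast.

```python
import pandas as pd

# Hypothetical inputs that share only the "Team" column.
df1 = pd.DataFrame({"Team": ["A"], "Score": [21]}, index=[1])
df2 = pd.DataFrame({"Team": ["B"], "Attendance": [50000]}, index=[2])

# Default join="outer": all rows and the union of columns are kept;
# values missing from an input become NaN.
outer = pd.concat([df1, df2])

# join="inner": all rows are kept, but only columns common to every input.
inner = pd.concat([df1, df2], join="inner")

print(outer)
print(inner)
```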

Sometimes the input data frames have generic indexes that overlap, like the row numbers in a spreadsheet. This case often occurs when batch-loading data into data frames without specifying a named index.

Not to worry - concat has an optional parameter called ignore_index. You can set it to True and let pandas generate a new, uniquely numbered index for the concatenated data frame.
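A short sketch of the overlapping-index case, assuming two hypothetical batches loaded without a named index, so each gets the default row numbers 0 and 1:

```python
import pandas as pd

# Hypothetical batches with default (overlapping) integer indexes.
batch_1 = pd.DataFrame({"Team": ["A", "B"], "Score": [21, 14]})
batch_2 = pd.DataFrame({"Team": ["C", "D"], "Score": [7, 28]})

# Without ignore_index the original labels are kept, so 0 and 1 repeat.
overlapping = pd.concat([batch_1, batch_2])

# With ignore_index=True pandas discards the old labels and renumbers from 0.
renumbered = pd.concat([batch_1, batch_2], ignore_index=True)

print(list(overlapping.index))
print(list(renumbered.index))
```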

You can also use concat when combining complementary data.

Here you'll want to concatenate across columns, which is like pasting tables side by side.

You'll need to tell pandas to join by column. Pandas refers to moving down rows as axis=0 and across columns as axis=1.

By specifying axis=1 in the concat statement, we override the default behavior and join the columns.

It's worth noting that all columns from the data frames are included by default. You may need to drop or rename columns before continuing.
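The column-wise case might look like the sketch below. The two frames are hypothetical and share both the same index and a 'Team' column, so the result contains 'Team' twice. Dropping the repeated column with columns.duplicated() is one possible cleanup, not the only one.

```python
import pandas as pd

# Hypothetical complementary tables that share the same index.
game_keys = pd.Index([1, 2], name="Game Key")
scores = pd.DataFrame({"Team": ["A", "B"], "Score": [21, 14]}, index=game_keys)
venues = pd.DataFrame({"Team": ["A", "B"], "Stadium": ["North", "South"]}, index=game_keys)

# axis=1 places the frames side by side, aligning rows on the index.
side_by_side = pd.concat([scores, venues], axis=1)

# "Team" now appears twice; keep only the first occurrence of each column name.
deduped = side_by_side.loc[:, ~side_by_side.columns.duplicated()]

print(side_by_side)
print(deduped)
```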

It's now time to concatenate your knowledge and practice on some data.

#Python #PythonTutorial #DataCamp #Concatenation #Pandas #Joins #Spreadsheet
