Newest 'dataframe' Questions

Score of 1

2 answers

121 views

df.explode() has no effect when used in a for-loop

I want to call multiple pandas functions on multiple DataFrames, to no avail. Unfortunately, I do not understand why, since I think it should work, as pandas is referencing the DataFrames. Below is s ...

clme

reputation score 63

asked Jun 25 at 10:58

Score of 2

1 answer

121 views

Unexpected duplicates after pd.pivot_table [closed]

I have a data set that looks like this:
>>> df_anon.head()
   a   b  c         d
0  1  30  1  929.3453
1  1  30  3  875.3986
2  1  30  5  849.9972
3  1  51  1  571.8364
4  1  51  2  ...

nestor556

reputation score 486

asked Jun 16 at 18:35

Best practices

0 votes

6 replies

107 views

How to detect a specific sequence of string values within a Dataframe column

I am performing analysis of some log files with Python/Pandas, and I am trying to develop an efficient operation to find a specific sequence of string values within a column. My current idea was to ...

GandalfDG

reputation score 31

asked Jun 16 at 15:46

Score of 0

1 answer

112 views

Assign category in new column based on multiple parameters [duplicate]

Using the Palmer Penguins dataset and the test data df_query below, I need to write a function or set of commands to add a new column that assigns each penguin the CATEGORY of small, medium or large. ...

shu251

reputation score 289

asked Jun 12 at 17:49

Score of 1

0 answers

105 views

Formatting data matrix for Jaccard Similarity

I'm trying to do a Jaccard Similarity on my presence/absence data in RStudio, but I get this error. jaccard_dist <- vegdist(dat2, method = "jaccard", binary = TRUE) Error in vegdist(dat2, ...

fis323

reputation score 15

asked Jun 10 at 23:01

Score of -8

2 answers

172 views

python dataframe rolling by date to concatenate a string

In "python dataframe rolling by date to get sum and concatinate a string", Furas demonstrated how to concatenate a string (the transaction id) from a rolling group-by. This does not solve the problem. I ...

KBD

reputation score 153

asked Jun 9 at 15:38

Advice

1 vote

6 replies

137 views

Replacing xls file input with txt input

I have some code suitable for .xls file handling, but the input files are not consistent. If I use .txt files as input, the problem may be solved. I need some sample code for the same functionality. ...

prashanth manohar

reputation score 680

asked Jun 7 at 11:58

Score of -1

1 answer

165 views

python dataframe rolling by date to get sum and concatinate a string

I need to sum up transaction amounts over a rolling period and concatenate the transaction ids that make up the total. Full code is at the bottom. The statements below return the expected results. ...

KBD

reputation score 153

asked Jun 5 at 16:36

Score of 0

1 answer

124 views

DolphinDB Python API: __DolphinDB_Type__ triggers Pandas UserWarning

I'm using the DolphinDB Python API to upload a pandas DataFrame and want to control the DolphinDB column types, for example trade_time as DATETIME instead of the default STRING. Here's what I'm doing: ...

Olivia

reputation score 29

asked Jun 4 at 1:15

Best practices

1 vote

5 replies

121 views

How to avoid iterrows() for string concatenation and One-Hot Encoding in Pandas?

I am a university freshman learning AI, and we are currently working on a Kaggle dataset. I need to concatenate two string columns (ColA and ColB) and then convert the result into One-Hot Encoding. ...

Xuan

reputation score 1

asked Jun 3 at 15:24

Advice

1 vote

2 replies

63 views

How do I look into https://raw.githubusercontent.com/python-visualization/folium/master/examples/data to see available data?

I am learning how to create maps using Python, and a lot of the examples I learn from use https://raw.githubusercontent.com/python-visualization/folium/master/examples/data as an example dataset. ...

Jose Hurtado

reputation score 11

asked Jun 3 at 0:39

Score of -5

1 answer

186 views

python dataframe rolling by date to get sum

I need to accumulate the sum of transaction amounts, looking back a set number of days. 10 days would be a start. I have petty cash transactions for a couple of people and I want to sum their spending ...

KBD

reputation score 153

asked Jun 2 at 17:53

Score of 3

2 answers

229 views

How to deduplicate (based on two identical columns) and merge the remaining columns into a single row for a large dataframe?

Example data: ID<-c("A","A","A","A","A","A","B","B","B") HFAdmission<-c("2020-01-01", "...

Simran Parmar

reputation score 83

asked May 29 at 21:53

Score of 3

2 answers

144 views

Why is the second "over" needed?

Taking the data from the question "add a new column based on a group without grouping": df = pl.DataFrame({ 'year': [ 5, 5, 5, 10, 10, 15, 15, 30, 30, 30 ], ...

rhug123

reputation score 9066

asked May 26 at 21:55

Score of -2

1 answer

125 views

Optimize a Python Polars function to avoid counting IDs or stacking empty dataframes

I have a common pattern in my workflows where I have a 'primary' dataframe which I may need to subset to a portion of rows, update values and potentially add new columns, and then merge those subset ...

sbj

reputation score 37

asked May 21 at 21:22
