28,184 questions
Tooling
0 votes
0 replies
37 views
The Most Memory-Efficient Way to Serialize Random Forest Models
I am training sklearn.ensemble.RandomForestRegressor on 272k rows of data with 38 features.
{
"n_estimators": 200,
"n_jobs": -1,
"random_state": 42
"...
Score of 0
0 answers
63 views
Optimize SGD weights convergence for frame differencing features [closed]
I am developing a real-time sports analytics system in Python 3.12. The pipeline decodes a video stream frame-by-frame to classify whether a ball is "IN" or "OUT" relative to a ...
Score of 3
1 answer
61 views
How to use StandardScaler on only specific columns in a pandas DataFrame? [duplicate]
I have a DataFrame with three columns: 'power', 'time', and 'status'. The 'status' column contains 0 or 1 (target variable). I don't want to scale the 'status' column.
How can I apply StandardScaler ...
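A minimal sketch of two common approaches, assuming the column names from the question ('power', 'time', 'status'). The sample values and the variable names (`scale_cols`, `scaled`, `ct`) are made up for illustration and are not from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "power": [10.0, 20.0, 30.0, 40.0],
    "time": [1.0, 2.0, 3.0, 4.0],
    "status": [0, 1, 0, 1],
})

scale_cols = ["power", "time"]  # columns to scale; 'status' is left alone

# Option 1: scale the selected columns and assign them back to a copy.
scaled = df.copy()
scaled[scale_cols] = StandardScaler().fit_transform(df[scale_cols])

# Option 2: ColumnTransformer scales the listed columns and passes the
# remaining ones through unchanged. The result is a NumPy array with the
# scaled columns first, followed by the passthrough columns.
ct = ColumnTransformer(
    [("scale", StandardScaler(), scale_cols)],
    remainder="passthrough",
)
scaled_ct = ct.fit_transform(df)
```

Since 'status' is the target, it may be simpler to split it off before preprocessing and scale only the feature frame; the ColumnTransformer variant is mainly useful when the scaler has to live inside a Pipeline.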
Score of 0
1 answer
99 views
PyCaret 3.4.0 and scikit-learn return different results
I am currently working with PyCaret 3.4.0, since 4.0 lacks some configuration parameters that are useful for my case.
I tried to replicate PyCaret results using scikit-learn.
This is my script, after ...
Best practices
0 votes
1 reply
56 views
Rationale for StandardScaler over MinMaxScaler in spatiotemporal tree-based ensemble models with SHAP interpretability
I am developing a spatiotemporal tree-based ensemble framework (utilizing LightGBM, XGBoost, and CatBoost) to forecast dengue outbreaks based on climate variables (temperature, precipitation, humidity)...
Score of 0
1 answer
79 views
uncommon "setting an array with a sequence" error
I'm struggling with an error that occurs when creating an MRE.
Why does this code:
from pandas import DataFrame as df
import numpy as np
from sklearn.neighbors import NearestNeighbors as nrb
from sklearn.decomposition ...
Advice
1 vote
2 replies
115 views
What is the difference between SciPy's KNN and Scikit-Learn's KNN?
I would like to understand the difference between the KNN algorithm as implemented in scikit-learn (for example, using the kdtree algorithm)
https://scikit-learn.org/stable/modules/generated/sklearn....
Score of 2
1 answer
69 views
ValueError: pos_label=1 is not a valid label: It should be one of [0]
When executing this code:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LearningCurveDisplay, ...
Advice
3 votes
4 replies
186 views
How Do Machine Learning Algorithms Improve Accuracy Over Time?
I recently started learning Python and machine learning. I noticed that many developers use libraries like TensorFlow and Scikit-learn for predictive modeling. What are the main advantages of using ...
Score of 1
1 answer
104 views
KDTree 2d index searching based on pivot_table vector error
I want to quickly find a 2D index (from PCA) based on a pivot_table vector, using my custom metric. That is why I'm using KDTree, but it raises this error:
ValueError Traceback (most recent call last)
...
Advice
0 votes
5 replies
191 views
What are the key libraries and documentation for implementing anomaly detection in Python?
I am starting a role in industry where I will be working on anomaly detection using machine learning, particularly for data analysis tasks.
I would like to understand which tools and libraries are ...
Score of 0
1 answer
63 views
How to make a 2-label confusion matrix and export it to a JSON file?
I have to train a convolutional neural network on a dataset. The NN itself works and does what it's supposed to but now I want to make a confusion matrix and export it into a json file for further ...
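A rough sketch of one way to do this, assuming the network's predictions have already been converted to class labels. The label lists, the `payload` layout, and the output filename are hypothetical and not taken from the original post.

```python
import json
import tempfile
from pathlib import Path

from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth and predicted labels for a two-class problem.
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])

# NumPy arrays are not JSON-serializable, so convert to nested lists first.
payload = {"labels": [0, 1], "matrix": cm.tolist()}

# Hypothetical output location; any writable path would do.
out_path = Path(tempfile.gettempdir()) / "confusion_matrix.json"
out_path.write_text(json.dumps(payload, indent=2))
```

Storing the label order next to the matrix keeps the file self-describing, since the row and column order otherwise depends on how `labels` was passed.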
Score of 2
1 answer
93 views
Sklearn Pipelines, adding features and column transformers
I'm just trying out/experimenting with sklearn. I'm using the California housing dataset, and I'm trying to make a pipeline to create some additional features, then take the logarithm of some features,...
Advice
0 votes
4 replies
86 views
How to cluster data based on a single value in Python?
I have object data stored in a JSON file:
One Drive link to json
These represent markers which I am placing on a 2D map (the lat/lng in the file are YX positions on the map).
In reality the 3D objects ...
Advice
0 votes
2 replies
65 views
Generic sklearn template
I am creating a reusable scikit-learn pipeline for tabular data with numeric and categorical columns.
I want to:
Impute missing numeric values with the median
Scale numeric columns
Impute ...
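One possible shape for such a template, as a sketch only. The numeric steps (median imputation, scaling) follow the list above; the excerpt is cut off after that, so the categorical branch (most-frequent imputation plus one-hot encoding) is an assumption, as are the sample columns `age` and `city`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric branch: median imputation followed by scaling, as listed above.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical branch: ASSUMPTION, the original list is truncated here.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# Select columns by dtype so the template does not hard-code column names.
preprocess = ColumnTransformer([
    ("num", numeric, make_column_selector(dtype_include=np.number)),
    ("cat", categorical, make_column_selector(dtype_exclude=np.number)),
])

# Hypothetical data to exercise the template.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Oslo", "Rome", np.nan, "Oslo"],
})
X = preprocess.fit_transform(df)
```

An estimator can be appended by wrapping `preprocess` in another `Pipeline`, which keeps imputation and scaling statistics fitted on the training folds only during cross-validation.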