Questions tagged [machine-learning]


Machine learning algorithms build a model of the training data. The term "machine learning" is vaguely defined; it includes what is also called statistical learning, reinforcement learning, unsupervised learning, etc. ALWAYS ADD A MORE SPECIFIC TAG.

20,433 questions

0 votes

0 answers

6 views

When testing a specific hypothesis regarding HTE with "best_linear_projection" in a Causal Forest, is it valid to halve the p-value?

I’m using the "grf" package in R and its "best_linear_projection" function, which regresses doubly robust (AIPW) scores on a set of covariates/features. I have a directional ...

Jo99

asked 6 hours ago
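For readers unfamiliar with what "halving the p-value" means in the question above, here is a minimal sketch of the arithmetic only. It assumes a test statistic with a symmetric null distribution and a direction fixed before looking at the data. It does not call the grf package, and the function name and arguments are hypothetical. Whether the shortcut is valid for "best_linear_projection" is the open question, which this sketch does not settle.

```python
def one_sided_p_value(two_sided_p, estimate, hypothesized_sign):
    """Convert a two-sided p-value into a one-sided one.

    Assumes a symmetric null distribution (e.g. a t or z statistic).
    hypothesized_sign is +1 for the alternative "coefficient > 0"
    and -1 for the alternative "coefficient < 0".
    """
    if not 0.0 <= two_sided_p <= 1.0:
        raise ValueError("two_sided_p must lie in [0, 1]")
    if hypothesized_sign not in (1, -1):
        raise ValueError("hypothesized_sign must be +1 or -1")

    # Halving applies only when the estimate points in the hypothesized direction.
    same_direction = estimate * hypothesized_sign > 0
    if same_direction:
        return two_sided_p / 2
    # Otherwise the one-sided p-value is large, not small.
    return 1 - two_sided_p / 2


if __name__ == "__main__":
    print(one_sided_p_value(0.08, estimate=0.5, hypothesized_sign=1))
    print(one_sided_p_value(0.08, estimate=-0.5, hypothesized_sign=1))
```

Note the second branch: if the estimate has the opposite sign to the directional hypothesis, halving would overstate the evidence.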

0 votes

0 answers

25 views

What do I need to determine what the best model is to interpolate missing time series data?

I have a numerical time series with missing data points (rows). Someone suggested using a spline here, but that approach is old. I want a modern approach. AI recommends XGBRegressor or RandomForestRegressor. How do I ...

Michael

asked yesterday

0 votes

0 answers

14 views

Is it a bad idea to use Transformer models on long-tailed datasets?

I’m working on a video classification task with a long-tailed dataset where a few classes have many samples while most classes have very few. More specifically, my dataset has around 9k samples and 3....

Olivia

asked 2 days ago

4 votes

2 answers

293 views

Is it okay in prediction problems to put post-outcome features in the model?

I am relatively new to machine learning. I see many examples of practices where people include variables that are only available after the outcome variable (Y) to make predictions. An example of this ...

Abdullah Abdelaziz

asked Oct 31 at 18:22

1 vote

0 answers

20 views

Need advice on length of context for future prediction

I'm using a trained foundation model to forecast values on a time series. The model works by taking a window of recent data (context) to predict near-future outcomes (horizon). How can I know the ...

Michael

asked Oct 30 at 20:52
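The context/horizon setup described in the question above can be sketched as a sliding window over the series. This is only an illustration of the windowing, not of any particular foundation model; the function name `make_windows` and its parameters are hypothetical.

```python
def make_windows(series, context_length, horizon):
    """Split a series into (context, target) pairs.

    Each pair holds `context_length` consecutive observations followed
    by the next `horizon` observations that a forecaster would predict.
    """
    if context_length <= 0 or horizon <= 0:
        raise ValueError("context_length and horizon must be positive")

    windows = []
    # Last start index that still leaves room for a full context and horizon.
    last_start = len(series) - context_length - horizon
    for start in range(last_start + 1):
        split = start + context_length
        windows.append((series[start:split], series[split:split + horizon]))
    return windows


if __name__ == "__main__":
    data = list(range(10))
    for context, target in make_windows(data, context_length=4, horizon=2):
        print(context, "->", target)
```

One way to compare candidate context lengths, under these assumptions, is to build such windows from held-out data for each length and compare the forecast error on the targets.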

1 vote

0 answers

28 views

In ablation with a fixed algorithm and fixed hyperparameters, can the expected test risk increase when adding strictly informative features?

Goal (decision-theoretic): I want to know whether there exist conditions under which the EXPECTED test risk strictly increases when I add information to the input, while keeping the learning rule fixed....

Jacopo Mancini

asked Oct 29 at 11:49

0 votes

0 answers

64 views

Why is cross-validation better than data-splitting for small datasets? [closed]

In model building, it is common practice to split the entire dataset into training and testing sets and then use the training set to build the model. However, I favor K cross-...

Rahul

asked Oct 27 at 21:10

0 votes

0 answers

19 views

Non-linear regression for modeling accuracy of ML models

Suppose I have a slow model with an accuracy between 75% and 80%. I want to approximate this model with faster models. Fast models require $e$ effort, and the more effort the better. I want to estimate ...

Gaslight Deceive Subvert

asked Oct 27 at 15:49

2 votes

0 answers

20 views

Mixed-effects random forest regression conditional variable permutation importance software implementations

Is there any existing open-source software implementation of mixed-effects random forest regression (for clustered data) that employs conditional inference decision trees as base learners, and enables ...

Mike

asked Oct 27 at 13:28

1 vote

2 answers

63 views

How can Kernel Density Estimation learn multiple classes?

So I've stumbled upon this example on the scikit-learn website, where a KDE instance is trained with handwritten digits and then used to synthesize samples: https://scikit-learn.org/stable/auto_examples/...

Polyval4

asked Oct 27 at 1:13

0 votes

0 answers

29 views

Transformation for clinical data with skewed distribution and a lot of zeros

I’m building a machine learning model using medical data. The features include clinical measurements (e.g., hemoglobin level in blood). The lab confirmed that some of these values can actually be ...

Marco Simoni

asked Oct 24 at 14:35

5 votes

1 answer

97 views

Is there a "better" approach when it comes to model evaluation on multiple test datasets?

I have two models trained and validated on the same training/validation data. Now I need to evaluate them on multiple independent test datasets (e.g., 10 different datasets of the same measure). Which ...

user26416177

asked Oct 23 at 16:28

-1 votes

0 answers

52 views

Has this method of unsupervised learning already been tested? [closed]

Suppose the data consist of $n$ parameters, and consider a layer that also has $n$ parameters. Let $f$ denote the function $[0,1]^n \rightarrow [0,1]^n$ from the first to the second layer, which we will consider ...

Guill Guill

asked Oct 19 at 16:52

0 votes

0 answers

49 views

How to apply Naive Bayes classifier when classes have different binary feature subsets?

I have a large number of classes $\mathcal{C} = \{c_1, c_2, \dots, c_k\}$, where each class $c$ contains an arbitrarily sized subset of features drawn from the full space of binary features $\mathbf{X}...

Special Sauce

asked Oct 12 at 3:08

