Vol. I · No. 18 · THU, MAY 7, 2026

r/MachineLearning


Last updated May 7, 2026, 4:30 PM

Stop letting LLMs edit your .bib [D]

Research community reports frequent LLM hallucinations in bibliography generation, with incorrect author attributions despite correct titles, raising integrity concerns.

··

Why SSMs struggle in parameter-constrained training: empirical findings at 25M parameters [R]

After ~3 weeks of experimentation in OpenAI's Parameter Golf competition, I wrote up why SSMs are structurally disadvantaged relative to transformers in a time- and size-constrained regime (10 min training, 16MB artifact, 25M parameters) on 8xH100s: [https://mradassaad.github.io/posts/why-ssms-struggle-in-parameter-golf/](https://mradassaad.github.io/posts/why-ssms-struggle-in-parameter-golf/) Main findings: 1. SSM in_proj weights compress up to 3.26x worse than attention QKV under LZMA, directly taxing the compressed parameter budget 2. Architectural wins validated at SP4096 flipped sign...
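Finding 1 is easy to probe on toy data: LZMA's ratio on a raw float16 weight buffer depends heavily on the weights' statistics. A minimal sketch with made-up Gaussian matrices, not the competition's actual checkpoints (`lzma_ratio` is an illustrative helper name):

```python
import lzma

import numpy as np

def lzma_ratio(w: np.ndarray, preset: int = 6) -> float:
    """Compressed size / raw size of a tensor's float16 byte buffer."""
    raw = w.astype(np.float16).tobytes()
    return len(lzma.compress(raw, preset=preset)) / len(raw)

rng = np.random.default_rng(0)
dense = rng.normal(0.0, 0.02, size=(512, 512))   # full-precision-looking weights
quant = np.round(dense / 0.01) * 0.01            # coarsely quantized copy

r_dense = lzma_ratio(dense)
r_quant = lzma_ratio(quant)
# Fewer distinct byte patterns -> much better LZMA ratio, i.e. cheaper under a
# compressed-artifact budget; the post reports SSM in_proj weights sit on the
# bad end of this spectrum relative to attention QKV.
```

The point is not the exact numbers but that "25M parameters" under a 16MB *compressed* artifact is not a flat budget: weight tensors whose bytes LZMA models poorly effectively cost more of it.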

··

ICML final decisions rant [D]

Reddit discussion critiquing ICML's 27% acceptance rate and review quality, raising concerns that triaged and rejected papers will cascade into the NeurIPS submission pool.

··

I spent years building a 103B-token Usenet corpus (1980–2013) and finally documented it [P]

For the past several years I've been quietly assembling and processing what I believe is one of the larger privately held pretraining corpora around... a complete Usenet archive spanning 1980 to 2013. Here's what it ended up being: * **103.1 billion tokens** (cl100k_base) * **408 million posts** across 9 newsgroup hierarchies * **18,347 newsgroups** covered * **33 years** of continuous coverage The processing pipeline included full deduplication, binary removal (alt.binaries.* excluded at the hierarchy level before record-level cleaning), quoted text handling, email address redaction via...
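For readers curious what record-level cleaning like this can look like, here is a minimal sketch of quoted-text stripping and email redaction on a single post body. The regexes and the `clean_post` helper are illustrations of the general technique, not the author's actual pipeline:

```python
import re

# Illustrative patterns -- the post does not publish the real ones.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
QUOTE_RE = re.compile(r"^>.*$", re.MULTILINE)  # classic '>'-prefixed reply quoting

def clean_post(body: str) -> str:
    body = QUOTE_RE.sub("", body)                    # drop quoted reply text
    body = EMAIL_RE.sub("[email redacted]", body)    # redact addresses
    return re.sub(r"\n{3,}", "\n\n", body).strip()   # collapse leftover blanks

cleaned = clean_post("Hi bob@example.com, thanks!\n> old quoted reply\nCheers")
```

Real Usenet cleaning is messier than this (signature blocks, nested quoting styles, MIME fragments), which is presumably why the pipeline took years.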

··

[ECCV 2026] Review Discussion [D]

ECCV reviews should be out by 2nd May. Since no exact time was specified this year, they’ll likely be released sometime within the next 48 hours. Hopefully, the reviews go well for everyone. We can use this thread to discuss them, as I haven’t seen one started yet.

··

AI/ML Conferences [D]

Reddit discussion on fairness and consistency issues in peer review at top-tier ML conferences like ICML 2026.

··

An interactive semantic map of the latest 10 million published papers [P]

I built a map to help navigate the complex scientific landscape through spatial exploration. How it works: Sourced the latest 10M papers from OpenAlex and generated embeddings using SPECTER 2 on titles and abstracts. Reduced dimensionality with UMAP, then applied Voronoi partitioning on density peaks to create distinct semantic neighborhoods. The floating topic labels are generated via custom labelling algorithms (definitely still a work in progress!). There is also support for both keyword and semantic queries, and there's an analytics layer for ranking institutions, authors, and topi...
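The "Voronoi partitioning on density peaks" step can be sketched with plain NumPy on an already-reduced 2-D embedding: take the densest histogram cells as peak centres, then assign every point to its nearest peak, which is exactly a Voronoi partition of the plane. The histogram-based peak finder and both function names are one workable guess, not the author's code:

```python
import numpy as np

def density_peaks(points: np.ndarray, grid: int = 32, k: int = 3) -> np.ndarray:
    """Centres of the k densest histogram cells of a 2-D embedding."""
    hist, xe, ye = np.histogram2d(points[:, 0], points[:, 1], bins=grid)
    top = np.argsort(hist.ravel())[::-1][:k]
    ix, iy = np.unravel_index(top, hist.shape)
    cx = (xe[ix] + xe[ix + 1]) / 2
    cy = (ye[iy] + ye[iy + 1]) / 2
    return np.stack([cx, cy], axis=1)

def voronoi_assign(points: np.ndarray, peaks: np.ndarray) -> np.ndarray:
    """Nearest-peak labels == Voronoi cells induced by the peak centres."""
    d = np.linalg.norm(points[:, None, :] - peaks[None, :, :], axis=-1)
    return d.argmin(axis=1)

rng = np.random.default_rng(0)
pts = np.concatenate([rng.normal(c, 0.3, size=(200, 2))
                      for c in [(0, 0), (8, 0), (0, 8)]])
peaks = density_peaks(pts, grid=32, k=3)
labels = voronoi_assign(pts, peaks)   # one semantic "neighborhood" per point
```

At 10M points the real system would need a spatial index rather than a dense distance matrix, but the partitioning idea is the same.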

··

Stanford Paper review [D]

Reddit discussion on Stanford Paper Review tool; user seeks community feedback on reliability of AI-assisted paper review suggestions.

··

What is the scientific value of administering the standard Rorschach test to LLMs when the training data is almost certainly contaminated? [R] + [D]

A recent paper published in *JMIR Mental Health* (Csigó & Cserey, 2026) caught my attention. The researchers administered the 10 standard Rorschach inkblot cards to three multimodal LLMs (GPT-4o, Grok 3, Gemini 2.0) and coded their responses using the Exner Comprehensive System. They analyzed the models' "perceptual styles," determinants (like human movement vs. color), and human-related content themes. However, I am seriously struggling to understand the methodological validity of this setup, and I’m curious what the scientific community thinks. My main concerns are: Massive Data Cont...

··

Visualizing Loss Landscapes of Neural Networks [P]

Hey r/MachineLearning, Visualizing the loss landscape of a neural network is notoriously tricky since we can't naturally comprehend million-dimensional spaces. We often rely on basic 2D contour analogies, which don't always capture the true geometry of the space or the sharpness of local minima. I built an interactive browser experiment [https://www.hackerstreak.com/articles/visualize-loss-landscape/](https://www.hackerstreak.com/articles/visualize-loss-landscape/) to help build better intuitions for this. It maps how different optimizers navigate these spaces and lets you actually visualiz...
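For context, the standard trick behind such visualizations (popularized by Li et al., "Visualizing the Loss Landscape of Neural Nets") is to slice the million-dimensional loss along two random directions around the current weights. A self-contained sketch on a toy ill-conditioned quadratic standing in for a network loss; names and the toy function are stand-ins, not the linked demo's code:

```python
import numpy as np

def loss_surface_slice(loss_fn, theta, d1, d2, span=1.0, steps=21):
    """Evaluate loss_fn on the 2-D plane theta + a*d1 + b*d2."""
    alphas = np.linspace(-span, span, steps)
    return np.array([[loss_fn(theta + a * d1 + b * d2) for b in alphas]
                     for a in alphas])

rng = np.random.default_rng(0)
dim = 50
scales = np.logspace(0, 2, dim)                 # mismatched curvatures -> "sharp" axes
loss = lambda w: float(np.sum(scales * w ** 2)) # toy quadratic, minimum at 0
theta = np.zeros(dim)                           # pretend these are trained weights
d1, d2 = rng.normal(size=(2, dim))              # two random slice directions
grid = loss_surface_slice(loss, theta, d1, d2)  # 21x21 heightmap to contour-plot
```

One caveat the post alludes to: a single random 2-D slice can make the same minimum look sharp or flat depending on the directions drawn, which is why filter-wise normalization of `d1`/`d2` is usually added on real networks.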

··

Introducing AutoMuon, a one-line drop-in for AdamW [P]

Hey everyone, I've been working on a small Python package called AutoMuon that makes the Muon optimizer usable as a drop-in replacement for AdamW in arbitrary PyTorch training pipelines. The core idea is relatively simple: Muon works on the 2D weight matrices acting on hidden states (linear projections, conv layers), but you still need AdamW for embeddings, norms, biases, etc. AutoMuon scans your model at init and figures out the right optimizer for each parameter automatically. I am open to PRs, especially for expanding the module-type exclusion list if you hit edge cases in your architect...
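The routing rule described above ("2D matrices to Muon, everything else to AdamW") is easy to express over a model's named parameter shapes. A framework-agnostic sketch; the exclusion substrings and the `route_params` name are illustrative, not AutoMuon's actual API:

```python
def route_params(named_shapes,
                 exclude_substrings=("embed", "norm", "bias", "lm_head")):
    """Split parameters: >=2-D weight matrices -> Muon, everything else -> AdamW.

    Conv kernels (4-D) also go to Muon, which flattens them to matrices; the
    name-based exclusions keep embeddings, norms and biases on AdamW.
    """
    muon, adamw = [], []
    for name, shape in named_shapes:
        if len(shape) >= 2 and not any(s in name for s in exclude_substrings):
            muon.append(name)
        else:
            adamw.append(name)
    return muon, adamw

shapes = [
    ("tok_embed.weight", (50257, 768)),        # 2-D but an embedding -> AdamW
    ("blocks.0.attn.qkv.weight", (2304, 768)), # hidden-state projection -> Muon
    ("blocks.0.attn.qkv.bias", (2304,)),       # 1-D -> AdamW
    ("blocks.0.ln1.weight", (768,)),           # 1-D norm gain -> AdamW
    ("blocks.0.conv.weight", (64, 64, 3, 3)),  # conv kernel -> Muon
]
muon, adamw = route_params(shapes)
```

In PyTorch this partition would then feed two per-parameter optimizer groups; the interesting edge cases are exactly the ones the author mentions, modules whose names don't match any exclusion pattern but shouldn't be orthogonalized.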

··

Is the DS/ML role slowly being morphed into that of an AI engineer? [D]

Agents are amazing. Harnesses are cool. But the fundamental role of a data scientist is not to use a generalist model in an existing workflow; it's a completely different field. AI engineering is the body of the vehicle, whereas the actual brain/engine behind it is the data scientist's playground. I feel like I am not alone in this realisation that my role somehow got silently morphed into that of an AI engineer, with the engine's development becoming a complete afterthought. Based on industry requirements and ongoing research, most of the work has quietly shifted from building the engine t...

··
50 stories