News
Mathematics is the foundation of countless sciences, letting us model phenomena such as planetary orbits, atomic motion, signal frequencies, and protein folding. Moreover, it’s a valuable ...
What happens when two bright and conscientious parents, without planning to do so, create one of the most ambitious math acceleration programs in the country? Jason and Sandy Roberts started the ...
The benchmark tests AI language models (such as GPT-4o, which powers ChatGPT) against original mathematics problems that typically require hours or days for specialist mathematicians to complete.
According to Microsoft, the Phi-4-reasoning model has 14 billion parameters and is trained to handle complex and multi-step reasoning tasks such as mathematical problem-solving and scientific ...
[W]e investigate the fragility of mathematical reasoning in these models and demonstrate that their performance significantly deteriorates as the number of clauses in a question increases.
OpenAI’s o3: AI Benchmark Discrepancy Reveals Gaps in Performance Claims
The FrontierMath benchmark from Epoch AI tests generative models on difficult math problems.
Microsoft's Phi-4-Reasoning Models Bring AI Math and Logic Skills to Smaller Devices
Microsoft has introduced a new set of small language models called Phi-4-reasoning, Phi-4-reasoning-plus, and Phi-4-mini-reasoning, which are described as "marking a new era for efficient AI."