News
These new tools provide step-by-step explanations, solutions, and interactive 3D models to aid visual learning for STEM (science, technology, engineering, and math) subjects.
Welcome to the official repository for DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision-Language Models. This repository contains the code, resources, and ...
We describe the development of a visual model to represent the implementation of an ambitious mathematics program, which serves as an example of a complex educational reform. Visual models can be both ...
The evaluation on MATH VERSE highlighted that, while models like Qwen-VL-Max and InternLM-XComposer2 experienced a boost in performance (over 5% accuracy increase) without visual inputs, GPT-4V ...
Large Language Models (LLMs) and Large Multimodal Models (LMMs) exhibit impressive problem-solving skills in many tasks and domains, but their ability in mathematical reasoning in visual contexts has ...
The latest round of language models, like GPT-4o and Gemini 1.5 Pro, are touted as "multimodal," able to understand images and audio as well as text. But a new study makes clear that they don't ...
In this paper, we study the capability of visual context-based mathematical reasoning within the rapidly evolving field of Large Multimodal Models (LMMs). Achieving visual context-based mathematical ...
Although diffusion models advance condition-based visual generation, they suffer from speed and cost issues, unlike faster AutoRegressive methods that are limited in performance. To address these, we ...