GPT-4 Can Score in the 99th Percentile on the Graduate Record Examination Verbal (GRE-Verbal)
Updated: Apr 20
GPT-4 is the latest milestone in OpenAI’s effort to scale up deep learning. It is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.
For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. OpenAI spent 6 months iteratively aligning GPT-4, producing its best-ever results (though far from perfect) on factuality, steerability, and refusing to go outside of guardrails.
Over the past two years, OpenAI rebuilt its entire deep learning stack and, together with Azure, co-designed a supercomputer from the ground up for its workload. A year ago, the company trained GPT-3.5 as a first “test run” of the system. It found and fixed some bugs and improved its theoretical foundations. As a result, the GPT-4 training run was unprecedentedly stable, and GPT-4 became the first large model whose training performance OpenAI was able to accurately predict ahead of time. As it continues to focus on reliable scaling, OpenAI aims to hone its methodology to predict and prepare for future capabilities increasingly far in advance.
Enroll in HURU School's Artificial Intelligence Picodegree