NVIDIA has a tradition of producing a compelling mid-range GPU aimed at the 'sweet spot': the optimal balance of performance and value for the large majority of semi-casual, budget-conscious PC gamers who still want decent gaming performance. 2015 is no exception, with the GTX 960 replacing the GTX 760 and delivering what NVIDIA promised of the card: cool, quiet, overclockable and value for money.
In this review we compare the GTX 960 (specifically EVGA's SSC model with ACX 2.0 cooling) to two cards from 2013, MSI's GeForce GTX 760 HAWK (the 960's predecessor) and Radeon R9 270X HAWK, and take a brief look at how the GTX 960 performs at 4K compared to its bigger brother, the GTX 980.
We were able to confirm GTX 960 offers more performance at lower power.
Background
When the GTX 980/970 were released in September 2014, NVIDIA did not initially replace the GTX 750/750 Ti, whose purpose in the product stack was efficient, value-for-money mid-range gaming for casual gamers, sitting below the GTX 760 as the mainstream performance offering. Once the GTX 980 launched it was obvious that a 960 would eventually follow to deliver a top-to-bottom stack of cards based on the 2nd generation Maxwell GPU core, bringing with it all the benefits of the Maxwell GM204 core.
Both the 970 and 980 proved excellent performers (as they should be, being the GeForce top dogs), delivered good value per dollar, ran relatively cool and were overclockable. This set the benchmark, no pun intended, for any further refreshes and versions of GM204-derived GPU cores that would turn up in future graphics cards.
Conveniently, the GTX 960 is half a GTX 980 both on paper and physically, which makes some comparisons easier. It's almost half the price too.
Enthusiasts often expect too much from unannounced products, so to determine what to expect we need to look at things logically. In a stack of hardware products differentiated by performance, in this instance microprocessors such as CPUs or GPUs, the majority of sales and shipments are typically the more affordable, value-oriented cards, what NVIDIA refers to as the 'sweet spot' of price and performance. Whether it was the GeForce 256 of 1999 or the GTX 980 of 2014, the high-end cards are for power users and enthusiasts who can afford them. The price point of these cards has not changed much since the original GeForce, as there have always been cheaper offerings for more value-oriented gamers and enthusiasts.
By NVIDIA's nomenclature, the sweet spot cards have typically been named as model 6 on a scale from 1 to 9. Models 1 to 3 are entry-level parts for basic graphics and legacy games, 4 and 5 offer entry-level gaming performance, 6 offers mainstream performance for the budget-conscious enthusiast, 7 and 8 are high performance for enthusiasts, and 9 is usually the ultra or dual-GPU offering for the extreme enthusiast, the '1%' who must have the best of the best.
Programmable GPUs from the DirectX 9 era onwards, such as the 6600 GT, 7600 GT, 8600, 9600, 260, 460, 560 Ti, 660 Ti and 760, have sold well and delivered the value and performance enthusiasts expect. At the time of writing the GTX 460, despite being a 2010 release, can still deliver playable frame rates with good visuals at 1080p provided the detail settings are optimised.
This time around NVIDIA hopes the GTX 960 continues the trend. It is intended to be an upgrade path for these older GPUs, with improved paper specs while at the same time requiring minimal power.
The GTX 260 was one of the first mid-range GPUs to leverage the design of its higher performing siblings, utilising a large PCB and GPU die and requiring two 6-pin power connectors. This methodology of disabling cores and using a common PCB reduced time to market and offered increased performance at the expense of end user cost, power and heat. The 460/560 continued this trend. While these parts were acceptable in efficiency compared to the 480/580, it wasn't until the Kepler-based GTX 660/660 Ti that we saw a reversal, with emphasis placed on reducing cost, size and power.
Although the 660/660 Ti specifically needed two 6-pin power plugs to work, many designs came on a shortened PCB, allowing add-in-board makers to offer cheaper designs and letting these cards fit in a wider range of typical chassis. With Kepler in 2012 we finally saw a reversal of the ever-increasing power budget, and the GTX 960 is the result of that reversal, with power requirements swinging fully in the other direction. Mainstream performance GPUs with a single power socket have been uncommon over the past few years.
We saw a reduction or stabilisation with TDPs for CPUs but the upward trend for GPUs was somewhat alarming especially to those keeping track of developments. For NVIDIA to improve their designs to such a degree and require significantly less power while still on the 28nm die process is an amazing feat.
We could have performed an 'upgrade' style test comparing the GTX 460 and 960, but for this particular segment the 960, with its higher pixel and texture fillrates (i.e. how fast the GPU can render 3D graphics to the screen), similar memory bandwidth and reduced power requirements, is simply superior on paper. There are also five-plus years of development between them, a long time.
Comparing against the previous generation GTX 760 and the equivalent AMD Radeon, which is what we did, has its caveats too.
With the Kepler-based 600 series cards of 2012 we had the GTX 660, which was half a 680 using a 'cut' die that is physically smaller, and the GTX 660 Ti, which offered higher performance and used the full GPU die from the GTX 680 with some Streaming Multiprocessors (abbreviated SMX for Kepler, SMM for Maxwell) disabled to provide differentiation.
For the 700 series update to Kepler, retail received the 1152-core GTX 760 and OEMs the 1344-core GTX 760 Ti, both based on the GK104 core and having half the core count of a GTX 780 and GTX TITAN respectively. The handling of the 760 Ti is a little confusing, but given the 700 series is a refresh with higher performance the change in model line-up is not unexpected. The 760 Ti may have been kept out of retail due to cost and cannibalisation concerns. The 860 was skipped, being a notebook GPU.
NVIDIA has been very careful to compare the 960 to the 660 (and 560/460), not the higher performance and higher cost Titanium editions, claiming the 960 has 2x the power and efficiency of the GTX 660 of 2012.
The GTX 660 had 960 CUDA cores, the 760 has 1152, and both the 660 Ti and 760 Ti have 1344, all of the Kepler generation. The 960 has 1024 CUDA cores of the Maxwell generation. We can't compare directly on core count or design alone, as Maxwell is a re-architected GPU. What about price? The GTX 660 launched at US$229 and the 760 at US$249, while the 960 starts from US$199.
So comparing against the previous generation tells us if the manufacturer/designer's goals have been met and if a change in thinking delivers a better user experience, which may be at the expense of some overall performance yet at a gain of efficiency.
With a paper comparison against the most recent parts skewed, this leaves it up to benchmarks to see how much/if 960 is faster than a 760 and how much less power it uses. Even if 960 isn't faster or is even a little slower than 760, that's still a win as the card is cheaper new and consumes less power.
In some of our tests the factory overclocked 960 was slower than or level with the factory overclocked 760, but considering we measured lower idle and load power, the GTX 960 is an example of where faster isn't necessarily better!
The following table shows GeForce evolution and execution. We start with the 200 'Tesla' series, despite the 8800 series (G80) having the first unified shader design, because G80-based models do not line up easily with later generations. Even older chips such as NV40 and G70 had separate pixel and vertex shaders.
GeForce | Year | Codename | TDP (W) | Cores | TMU | ROPS | Mem Bandwidth | Memory Width | Pixel Fillrate | Texel Fillrate | GFLOPS |
---|---|---|---|---|---|---|---|---|---|---|---|
GTX 260 | 2008 | GT200 | 202 | 192 | 64 | 28 | 112 GB/s | 448-bit | 16 GigaPixel/s | 37 GigaTexel/s | 715 |
GTX 460 | 2010 | GF104 | 160 | 336 | 56 | 32 | 115 GB/s | 256-bit | 22 GigaPixel/s | 38 GigaTexel/s | 907 |
GTX 560 | 2011 | GF114 | 150 | 336 | 56 | 32 | 128 GB/s | 256-bit | 26 GigaPixel/s | 50 GigaTexel/s | 1277 |
GTX 660 | 2012 | GK106 | 140 | 960 | 80 | 24 | 144 GB/s | 192-bit | 24 GigaPixel/s | 79 GigaTexel/s | 1882 |
GTX 760 | 2013 | GK104 | 170 | 1152 | 96 | 32 | 192 GB/s | 256-bit | 31 GigaPixel/s | 94 GigaTexel/s | 2258 |
GTX 960 | 2015 | GM206 | 150 | 1024 | 64 | 32 | 112 GB/s | 128-bit | 39 GigaPixel/s | 72 GigaTexel/s | 2308 |
Radeon | Year | Codename | TDP (W) | Cores | TMU | ROPS | Mem Bandwidth | Memory Width | Pixel Fillrate | Texel Fillrate | GFLOPS |
---|---|---|---|---|---|---|---|---|---|---|---|
HD 7850 | 2012 | Pitcairn Pro | 130 | 1024 | 64 | 32 | 154 GB/s | 256-bit | 28 GigaPixel/s | 55 GigaTexel/s | 1761 |
R9 270X | 2013 | Curacao XT | 180 | 1280 | 80 | 32 | 179 GB/s | 256-bit | 32 GigaPixel/s | 80 GigaTexel/s | 2688 |
R9 285 | 2014 | Tonga Pro | 190 | 1792 | 112 | 32 | 176 GB/s | 256-bit | 29 GigaPixel/s | 103 GigaTexel/s | 3290 |
Source:Wikipedia, AMD, NVIDIA, and own sources. All cards 'reference'.
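One quick way to read the tables is to divide the paper compute figure by the rated TDP. The sketch below does that with the numbers copied from the tables above. It assumes the last column is single-precision GFLOPS and the TDP column is in watts, and the function names are our own, for illustration only. This is a paper metric: it says nothing about measured power draw or actual game performance.

```python
# Paper specs copied from the GeForce and Radeon tables above (reference cards).
# name: (rated TDP in watts, single-precision GFLOPS) -- units are assumed.
CARDS = {
    "GTX 260": (202, 715),
    "GTX 460": (160, 907),
    "GTX 560": (150, 1277),
    "GTX 660": (140, 1882),
    "GTX 760": (170, 2258),
    "GTX 960": (150, 2308),
    "HD 7850": (130, 1761),
    "R9 270X": (180, 2688),
    "R9 285": (190, 3290),
}


def gflops_per_watt(tdp_watts, gflops):
    """Paper compute throughput per watt of rated TDP."""
    return gflops / tdp_watts


def efficiency_table(cards=CARDS):
    """Return {card name: GFLOPS per watt, rounded to 2 decimals}."""
    return {
        name: round(gflops_per_watt(tdp, gflops), 2)
        for name, (tdp, gflops) in cards.items()
    }


if __name__ == "__main__":
    for name, value in efficiency_table().items():
        print(f"{name:8s} {value:6.2f} GFLOPS/W")
```

On these table figures the GeForce line climbs from about 3.5 GFLOPS/W for the GTX 260 to about 15.4 GFLOPS/W for the GTX 960.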
For this review, the MSI HAWK cards we compared our overclocked GTX 960 to were also heavily overclocked. In the case of the MSI GTX 760 HAWK, while its pixel fillrate is slightly lower than the 960's, its texture and memory throughput are significantly higher. If the GTX 960 can outperform this 760 then that is an accomplishment.
Maxwell GM206 GPU - a smaller chip for cost and efficiency.
Like their card/board models, NVIDIA also assigns a numerical model number to their GPUs. A 6 is usually the company's 2nd or 3rd best chip in the product stack. GM206 carries over all the goodness from GM204 (GTX 980/970) such as colour and memory compression, DirectX 12, Multi-Frame Sampled Anti-Aliasing (MFAA), Voxel Global Illumination (VXGI), HDMI 2.0, H.265 hardware support, lower temperatures and power consumption, and an improved video engine. G-Sync, DSR and GPU Boost are there as standard.
Apart from being half of a GM204, comprising half the compute units and memory controllers, the only change is to NVIDIA's NVENC video engine, which is proprietary silicon that is not part of the 3D pipeline. For the GTX 980, hardware H.265 (HEVC) codec support was promised and enabled in the video drivers, however the software to support it was not available and NVIDIA did not say much about the subject at launch. We were not able to test this functionality during our time with the GTX 980 at launch.
NVIDIA have now stated that GM206 has full H.265 encode/decode support while GM204 has H.265 encode only. This is disappointing for those who already own a 980/970 but good news for potential 960 owners, as the functionality is confirmed. However, supporting software is again required. A launch GPU having a bug or regression in the video hardware is not without precedent. The 8400/8600 follow-up to the 8800 included a significantly upgraded video engine and removed the need for the separate 'I/O' chip that was necessary on the higher-end parts, and the 6600 also had upgraded video support compared to the original 6800 launch model.
Unlike the GTX 970, the design of GM206 and the GTX 960 is straightforward and does not utilise any workarounds for memory access, as all hardware units in the chip (exactly half the full GM204/GTX 980 design) are available for use. We measured slightly above 2GB of usage in our testing, showing the card can fully utilise its 2GB frame buffer.
Being the half-design it is, GM206 is physically small at only 227 mm², rather than being the larger chip with half the cores disabled. This is done to optimise chip yields. If GM204 had been used for the GTX 960, many working chips that could otherwise have gone into more expensive GTX 970 or GTX 980 cards would have been wasted, driving costs up and production down. Thermals would not be optimal either, due to the variance of disabled and enabled cores. Selectively enabling cores on the bigger GPU chip to meet a market segment also runs the risk of segmented/asymmetrical memory access, which is currently extremely controversial for the GTX 970.
GM206 should be able to support larger memory capacities, judging by the unused solder pads on some of the GTX 960 samples we have seen. The GTX 960 specifically does not have the horsepower to run games at 4K with high to ultra details, so the value of the extra memory is debatable.
Making a dedicated smaller chip, although it carries a large upfront design cost, allows NVIDIA to optimise production costs and yields, therefore maximising availability of all three 2nd gen Maxwell desktop GPUs. GM206 is smaller and therefore more dense than GM204, which means higher temperatures when the chip is working hard. Although the chip is smaller than the GTX 980's, the overclocked EVGA sample we tested did reach the high 70s (°C) in load testing, and the 80s when overclocked further.
Explaining the actual CUDA/shader cores and the low-level functionality of a graphics pipeline is beyond the scope of this review, but we can still discuss the logical blocks and modules that make up a modern programmable GPU. You may hear talk of 'cores, clusters, TMUs and ROPs', and while a detailed understanding is not necessary, knowing their logical place in a GPU helps visualise the differences between smaller and larger GPUs. We have annotated NVIDIA's standard GPU block diagram of Maxwell to highlight each module.
Although we have talked about the fundamentals of the Maxwell architecture, the block diagram only represents the 'compute core' of the GPU. There is other logic that is not illustrated, such as the NVENC video engine, display controllers, power management, the crossbar interface between the different functional units, and telemetry, all of which combine to make a GPU. These need to be taken into account when considering the transistor count for the chip. The transistor count for this logic is constant between different GPU designs of the same family.
At launch time, the following NVIDIA partners have boards ready:
- ASUS - STRIX
- EVGA - ACX 2.0 cooler line Reference Single fan, Superclocked single fan, Reference dual fan, Superclocked dual fan, Super Superclocked dual fan, For the Win dual Fan (single fan models are short PCB for compact systems)
- Gainward - Overclocked Reference, Phantom dual fan, Phantom dual fan golden sample
- GALAX - Reference, Dual Fan OC, Dual Fan EXOC
- Gigabyte - Triple fan G1 Gaming
- Inno3D - Dual Fan HerculeZ X2, Triple fan iChill
- MSI - Reference, Dual Fan Armor, Dual Fan Gaming, Dual Fan 100 Mil Edition.
- Palit - Reference, Dual fan Jetstream, Dual fan Super Jetstream
- PNY - Reference (XLR8)
- POV - Reference (Trooper), Reference OC(Trooper Ammo)
- Zotac - Dual Fan, Dual Fan AMP! (OC)
Overall NVIDIA claims the GTX 960 is 50% faster than the GTX 660 and 2x as power efficient. Deep down, each GM206 Maxwell CUDA core is claimed to be 1.4x faster than a GK106 Kepler CUDA core, with 2x the performance per watt.
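As a rough sanity check, the per-core and card-level claims can be lined up with the core counts quoted earlier (1024 for the GTX 960, 960 for the GTX 660). The sketch below is plain arithmetic on NVIDIA's marketing figures, not a measurement. It deliberately ignores clock speed and memory differences, and the function names are hypothetical.

```python
def implied_speedup(per_core_gain, new_cores, old_cores):
    """Per-core gain scaled by the change in core count (clock speeds ignored)."""
    return per_core_gain * new_cores / old_cores


def implied_power_ratio(speedup, perf_per_watt_gain):
    """Power of the new card relative to the old, given speed and efficiency gains."""
    return speedup / perf_per_watt_gain


if __name__ == "__main__":
    # 1.4x per core, 1024 Maxwell cores versus 960 Kepler cores.
    speedup = implied_speedup(1.4, new_cores=1024, old_cores=960)
    print(f"Implied card-level speedup: {speedup:.2f}x")
    # 1.5x the performance at 2x the performance per watt.
    print(f"Implied relative power: {implied_power_ratio(1.5, 2.0):.2f}x")
```

The per-core claim scaled by core count comes out at roughly 1.49x, which is consistent with the '50% faster' headline, and 1.5x the performance at 2x the efficiency would imply about three quarters of the GTX 660's power.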
Multi-Frame Sampled Anti-Aliasing performance
One of the crown jewels of the 2nd gen Maxwell GPU family is MFAA, a feature implemented on the GPU that allows anti-aliasing sample patterns to be changed, promising MSAA-like quality at higher performance.
We could not test this with the GTX 980, as it took many weeks for the driver to be released and our card was returned before it became available. When a driver was made available, the 'whitelist' of MFAA-supported games was minimal, and while the tech was promising on paper, the implementation led some to see the feature as a gimmick.
With the launch of the GTX 960, NVIDIA has changed this. The whitelist is gone: all DirectX 10/11 games that support MSAA will work with MFAA, with the exception of Dead Rising 3, Dragon Age 2 and Max Payne 3 as of January 2015. This is great news and opens the door to added performance for thousands of titles.
NVIDIA promotes enabling MFAA with one click using GeForce Experience (GFE), however we do not see MFAA listed for all detected games. It is listed for Dirt 3, Hitman Absolution, War Thunder, PlanetSide 2, Crysis 3 and Battlefield, but not for Bioshock, Metro or either Batman.
An additional override is provided in the NVIDIA control panel, forcing MFAA on for all MSAA software.
In Dirt 3 at 1200p Ultra with 4x MSAA, enabling MFAA through GFE on top of the in-game MSAA setting results in a minimum of 108.83 FPS versus 102.8 FPS for standard MSAA.
In Hitman Absolution at 1200p Ultra with 4x MSAA, enabling MFAA using the driver override switch gave us min 29.126 / avg 35.433 / max 46 FPS, versus min 29.126 / avg 36.66 / max 44.66 FPS for the standard setting.
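To put the two results above on the same footing, the short sketch below converts them to percentage changes relative to standard MSAA. The FPS values are the ones quoted in the text. The helper names are ours and purely illustrative.

```python
def percent_change(baseline, value):
    """Return the change from baseline to value as a percentage of baseline."""
    return (value - baseline) / baseline * 100.0


# Figures quoted in the text above (FPS at 1200p Ultra, 4x MSAA).
DIRT3_MIN = {"msaa": 102.8, "mfaa": 108.83}
HITMAN = {
    "msaa": {"min": 29.126, "avg": 36.66, "max": 44.66},
    "mfaa": {"min": 29.126, "avg": 35.433, "max": 46.0},
}


def mfaa_deltas():
    """Percentage change with MFAA enabled, per quoted statistic."""
    deltas = {"Dirt 3 min": percent_change(DIRT3_MIN["msaa"], DIRT3_MIN["mfaa"])}
    for stat in ("min", "avg", "max"):
        deltas[f"Hitman {stat}"] = percent_change(
            HITMAN["msaa"][stat], HITMAN["mfaa"][stat]
        )
    return deltas


if __name__ == "__main__":
    for label, delta in mfaa_deltas().items():
        print(f"{label:12s} {delta:+.2f}%")
```

On these figures MFAA lifts the Dirt 3 minimum by a little under 6%, while in Hitman the minimum is unchanged, the average moves about 3% down and the maximum about 3% up.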
To be fair, NVIDIA promotes MFAA as giving the most benefit at higher resolutions, especially 4K. This is just a highlight, and a further report will go into more detail on MFAA.
EVGA GTX 960 SSC Edition Graphics Card
For launch coverage, NVIDIA sent out non-reference, typically overclocked cards to the media, having decided not to sample the lower-clocked (and therefore lower performance) reference cards. The reference cards have a short PCB and the standard black plastic cooler with blower fan that also came with reference 660/760s.
For the 960, the reference cards have been relegated to the bottom of the food chain, being the cheapest available starting at $199. These are clocked at 1126 MHz base, with a minimum Boost speed of 1178 MHz and a 7010 MHz memory clock.
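The reference clocks are enough to reproduce two of the paper figures in the spec table. The sketch below assumes the 7010 MHz memory figure is the effective data rate on the card's 128-bit bus, and that each of the 1024 CUDA cores retires one fused multiply-add (two floating point operations) per clock. Both are our assumptions, and the function names are illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits, effective_mhz):
    """Peak memory bandwidth in GB/s: bytes per transfer x million transfers per second."""
    return bus_width_bits / 8 * effective_mhz / 1000


def peak_gflops(cuda_cores, clock_mhz):
    """Peak single-precision GFLOPS, assuming 2 FLOPs (one FMA) per core per clock."""
    return 2 * cuda_cores * clock_mhz / 1000


if __name__ == "__main__":
    print(f"Bandwidth:       {memory_bandwidth_gbs(128, 7010):.2f} GB/s")
    print(f"GFLOPS at base:  {peak_gflops(1024, 1126):.0f}")
    print(f"GFLOPS at boost: {peak_gflops(1024, 1178):.0f}")
```

This gives about 112 GB/s and roughly 2306 GFLOPS at the base clock, in line with the 112 GB/s and 2308 listed for the GTX 960 in the table.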
We received EVGA's 960 SSC card from NVIDIA, while other media received either the EVGA SSC or the ASUS STRIX.
For the purposes of evaluating the GTX 960 itself as a GPU the EVGA SSC is both a blessing and a curse. On one hand it has extremely high clock speeds out of the box, equivalent to what may be achieved by overclocking a reference card. On the other hand these high clocks (and the related higher voltages and temperatures) provide a somewhat biased and skewed comparison to other GPUs due to this overclock.
Nonetheless, it's great to see a high factory-supported overclock on a graphics card, and EVGA have done a good job here.
End users can choose to buy a slightly cooler, less power hungry card or a higher clocked model. For our review we focus only on the EVGA SSC's 'default' clock speeds and some light overclocking, rather than downclocking the SSC to reference speeds. Downclocking to reference would provide a more scientific comparison to other GPUs, but in the real world no owner will downclock, not that far anyway.
EVGA have a number of special features on their ACX 2.0 cooler equipped series of GPUs. The most notable is the ACX cooler itself, which makes performance claims over the 'reference single blower' cooler and EVGA's own previous ACX cooler. Apart from the 'obvious', such as more and better fan blades, which many vendors claim, EVGA also claim to have redesigned the fan motors, making them stronger and faster. The most important claim for ACX 2.0, however, is the use of double ball bearings, which matters a great deal for the longevity of the GPU.
Typically, GPUs have used the cheapest fans possible that suit the size requirements of the GPU. Higher quality, longer lasting fans are expensive and add to the total cost of a GPU that is 'not expected to last forever' anyway. Such low grade fans use sleeve bearings which are highly susceptible to wear but have an advantage of not
A sleeve bearing is basically a metal tube filled with mineral oil that the fan rotates in. Over time and with heat, the oil evaporates through the clip/sticker that holds the fan in place, causing friction that makes the fan stall or stop. Dust ingress makes the situation worse by blocking the bearing, binding with the lubricant and pitting the brass bearing tube. Sometimes re-oiling the bearings can fix the fans temporarily, but expecting an end user to oil the fans on a computer card is unrealistic in this day and age.
Some vendors see cheaper fans as a non-issue, citing the warranty period or expected life span of the GPU. Reality doesn't work like this. Many end users keep a GPU for 5 or 10 years, even longer, and expect the fan to work. While PC chassis fans are standard and can easily be replaced, GPUs use bespoke fans which cannot, so reliability is paramount. Otherwise the GPU itself can and will fail, as seen in the large number of failures of popular GPUs from both ATI/AMD and NVIDIA.
Here at NitroWare we have had many GPUs fail simply due to failed sleeve bearing fans. In some cases the GPU had overheated and damaged half of the video memory or the core itself.
Ball-bearing fans are more reliable at the expense of cost and noise. Nevertheless our priority here is to encourage reliability, and EVGA get full marks. Consumer-grade computer hardware is often a cost-cutting race to the bottom. Other segments such as enterprise or industrial often do not experience cost-cutting of critical components.
EVGA ACX 2.0 uses two of these fans on top of a triple heatpipe cooler, comprising three non-nickel-plated pipes flattened for optimal surface contact. Other cards may use an S-shaped pipe to improve thermal transfer. The heatsink itself seems to be optimised for weight, comprising aluminium stampings simply clipped together.
The cooling fans are completely off while the card runs below 65°C, apart from at power-on. Under heavy load a fan speed of 20% at 70°C is typical.
The PCB itself is a departure from other types. The power socket is inboard, 1/3rd of the way and some of the power delivery circuits (capacitors,chokes) have been relocated to the left of the card. Being an overclocked enthusiast oriented model we get a strengthening plate to keep the PCB straight and provide minimal heatsinking, on the TOP of the card. This is still functional, but leaves the back side of the card bare, which contains some circuits and memory. The higher up FTW model has a backplate.
NVIDIA's 'reference' display layout of one DL-DVI, one HDMI 2.0 and three DisplayPort 1.2 outputs is used, plus a single SLI connector. An undocumented master/slave switch (likely for UEFI BIOS) and an undocumented 2-pin header are also present.
EVGA include the typical 'enthusiast' bundle, of media endorsements for other EVGA products, a generic manual, a generic setup poster, a flyer for the ACX features and silent cooler, enthusiast poster, case badge, a standard DVI-VGA dongle, and a 2x 6 to 1x 8 power adapter for those users who may have a PSU in the 400-600W range with only 2x6 GPU power plugs.
The card is flat black and has no LED or coloured accents. Some of our benchmarks run at uncapped frame rates in the three or four digit range, and the lack of frame rate capping/vsync induces slight coil whine from the card. Given NVIDIA/EVGA have software-controlled features to cap frame rates, this shouldn't be a problem in real use.
Test Methodology
The aim of the benchmarks in this review is to compare our overclocked EVGA GTX 960 sample to the GTX 760 we last tested, MSI's HAWK model, which, equipped with 2x 8-pin power connectors, expanded power delivery and an unlocked BIOS, was also an 'overclocker's dream'. Because of the time between our 760 and 960 reviews, we used an older Ivy Bridge based 3770K testbed rather than 4790K or 5960X Haswell based systems. Many end users with older Fermi or Tesla based GPUs waiting for an upgrade will likely still be running Nehalem, Sandy Bridge or Ivy Bridge systems.
At the time, we compared the MSI NVIDIA GTX760 HAWK to the MSI AMD R9 270X HAWK, MSI offering these two competitive GPUs with basically the same PCB and cooling package, for those who want either brand. The AMD card had a bit more compute power and this showed in some tests. We kept the results of the 270X in and also used them in this review to give an idea of how the previous gen AMD mid end card compares.
NVIDIA told us that the 960 is comparable to AMD's newer mid-range card, the R9 285, however we have not tested one, nor could we get one in time for the GTX 960 review.
We focused on 1080p and 1200p at ultra settings for standalone benchmarks, with some 13x7 medium and 1080p/1200p low to ultra results thrown in to show resolution and detail scaling for Dirt 3 and Bioshock.
We also performed an overall test at 4K ultra against a GTX 980 using a 4790K system, to show that for high loads such as this resolution hardware resources matter, and a small card like the 960, despite being '2nd gen Maxwell', just cannot handle these loads as it does not have enough cores, ROPs or memory.
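To give a sense of how much heavier the 4K load is than our usual test resolutions, the sketch below compares raw pixel counts per frame. It assumes '4K' here means 3840x2160 (UHD), and the names are illustrative. Pixel count is only a first-order proxy for GPU load, since memory use and shader cost do not scale purely with resolution.

```python
# Width x height for the resolutions discussed; 4K is assumed to be UHD.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "1200p": (1920, 1200),
    "4K": (3840, 2160),
}


def pixels(name):
    """Pixels per frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def pixel_ratio(high, low):
    """How many times more pixels 'high' renders per frame than 'low'."""
    return pixels(high) / pixels(low)


if __name__ == "__main__":
    for base in ("1080p", "1200p"):
        print(f"4K vs {base}: {pixel_ratio('4K', base):.1f}x the pixels")
```

Under that assumption a 4K frame carries 4x the pixels of 1080p and 3.6x those of 1200p, all of which has to be pushed through half the cores, ROPs and memory of a GTX 980.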
Some of our tests, e.g. Batman, may seem odd in the settings we chose. There is a reason for this. The NVIDIA-endorsed games in our test suite run GPU-accelerated PhysX physics on GeForce but fall back to the CPU on AMD. The penalty can be up to 30 FPS on AMD, so by disabling or minimising PhysX in comparative tests we keep the testing fair.
Testing was performed at summer temperatures, between 28°C and 32°C ambient.
Test System Specs
Primary
Type | Model |
---|---|
Processor | Intel Core i7-3770K 3.5GHz (3.7 to 3.9GHz Turbo, all-core turbo disabled) |
Motherboard | ASUS P8Z77-V Pro, BIOS 2104 |
Memory | 16 GB - 4x 4GB Corsair Vengeance DDR3-2400 C10 CMZ16GX3M4A2400C10 |
Cooling | Coolermaster TX-3 |
AMD Graphics | MSI RADEON R9 270X HAWK 2GB Part Number 113-C6310100-X15 BIOS Version 015.040.000.000.002887 |
NVIDIA Graphics | MSI GEFORCE GTX 760 HAWK 2GB Part Number P2004-0010 BIOS Version 80.04.BF.00.00 |
NVIDIA Graphics | EVGA GEFORCE GTX 960 SSC 2GB Part Number 02G-P4-2966-KR BIOS Version 84.06.0D.00.60 |
System Drive | Seagate Barracuda ST3000DM001 3.5" 3TB SATA |
Aux Drives | Kingston HyperX 120GB SSD, Seagate Backup Plus 1TB 2.5" External USB 3.0, WD My Book 3.0 2TB 3.5" External USB 3.0, PIONEER DVR-216 DVD-RW SATA |
Case | Corsair Vengeance C70 Mid Tower |
Power | FSP ‘Aurum Series’ AU-750M 750 Watts Power Supply 80PLUS GOLD |
Display | DELL Ultrasharp U2412M 24" LED backlit LCD Monitor |
Operating System | Microsoft Windows 8.1 Professional 64bit with latest updates as of Jan 2015. |
Graphics Driver | AMD Catalyst 13.11 Beta, NVIDIA GeForce 331.65, NVIDIA GeForce 347.25 (for GTX960), 344.65 (for GTX980) |
Storage Driver | Intel Rapid Storage 13.1 |
Secondary
Type | Model |
---|---|
Processor | Intel Core i7-4790K 4.0GHz (4.2 to 4.4GHz Turbo, all-core turbo enabled) |
Motherboard | MSI Z97 Gaming 7, BIOS 1.8 |
Cooling | Intel TS13X Liquid Cooler |
System Drive | Crucial M500 SSD 960GB |
Display | DELL Ultrasharp U2414Q 24" LED backlit LCD Monitor |
Storage Driver | Intel Rapid Storage 13.6 |
Benchmarked Game Settings
Software | Version | Test Settings |
---|---|---|
Alien V Predator | D3D11 Benchmark V1.03 | Texture Quality 3, Shadow Quality 3, Anisotropic 16, SSAO 1, Tessellation 1, Advanced Shadows 1, MSAA 4 |
Battlefield 3 | 1.6 | Operation Swordfish - Checkpoint 1 to 2, Ultra Defaults, No Vsync |
Batman Arkham City | 1.1 | DX11:MVSS & HBAO, High Tessellation, Very High Detail, All details on, Normal Physx. DX9:No Dx11, High , No Physx |
Bioshock Infinite | 1.1.24.21018 | Pre-sets Selected via benchmark menu |
Blender | 2.67 Nightly and 2.73 | BMW1M-MikePan |
Crysis | 1.2.1 | 1920x1200, Game Preset |
Crysis 2 | Maximum Edition Version 1.9 | Single Player Campaign 1920x1200,V-Sync and DX11 Enabled,System Spec Extreme |
Crysis 3 | Hunter Edition Version 1.3 | Intro Mission, 1920x1200, FXAA, Very High Tex, Very High Profile, Medblur, Lens Flares |
Cyberlink MediaEspresso | 6.7.4131_47226 | Both Hardware Encoding and Decoding Enabled for GPU, disabled for CPU big_buck_bunny_1080p_h264.mov. Bug in software prevents NVIDIA encode acceleration. |
Deus Ex Human Revolution | 1.3.643.1 | 1920x1200, MLAA, 16x AF, Soft Shadows, High SSAO, High DOF, Triple Buffering, Post Processing, Tessellation |
Dirt2 / Dirt 3 | 1.1 / 1.2 | 1920x1200, Ultra Pre-set, 4x MSAA |
Futuremark Benchmarks | All patched to latest versions | |
Grand Theft Auto IV | Title Update 7 | 1920x1200, Maximum details, 16x AFX, Detail distances 50,50,100, -norestrictions switch used if necessary |
Handbrake | Build 5893 and version 0.10 | |
Just Cause 2 | 1.0.0.2 | 1920x1200, Maximum Details on NVIDIA, 4x AA, 16x AF. |
Lost Planet 2 | Benchmark | 1920x1200, Maximum Details Including Motion Blur, 4x AA. DirectX 11 Mode. |
Metro 2033 | 1.01 | DX11 Very High Preset 4xMSAA 16xAF + DoF |
Resident Evil 5 | Benchmark 1.0.0.29 | 1920x1200, Maximum Details Including Motion Blur, 4x AA. DirectX 10 mode |
SPEC | SPECviewperf 11 | 1920x1200 , 4x AA |
Street Fighter IV | Benchmark 1.0.0.1 | 1920x1200, Maximum Details except Ink,4x AA, 16x AF |
Total War: SHOGUN 2 | 1.1 | All Ultra,MLAA, Soft Shadows, Tessellation, HDR,SSAO |
Trackmania Nations Forever | 2.11.26 | 1920x1200, VHQ Pre-set, 4x AA 16x AF |
Warthunder | 1.3 & 1.45 | Game pre-sets |
World In Conflict | Demo 1.0 | 1920x1200, Maximum Details, 4x AA, 16x AF. DirectX 10 Mode |
Benchmarks - GTX960 v GTX980 at 4K resolution
The purpose of this test is to show that specs and hardware resources matter. The GTX 960 is not a high-end card, nor is it marketed as such. If you want to drive 4K at ultra you need the right GPU, whether it's a GTX 980 or even a Radeon R9 290X.
Benchmarks - GTX 960 v GTX 760 & R9 270X
Overall System Performance benchmarks
Our PCMark 8 scores are presented twice, as the versions used for the GTX 760/R9 270X are not comparable to the current one. Note the version numbers.
PCMark comprises productivity (word processing/office/DTP), creativity (photo/video editing, video conferencing) and web publishing workloads, which generate an aggregate score. Each module is based on different code; for example the spreadsheet workloads use the code from LibreOffice's Calc application (which can be accelerated using OpenCL).
Considering the card's power requirements, and that the Radeon supports OpenCL 1.2 and 2.0, GM206 does OK in the OpenCL-accelerated PCMark tests.
Both Passmark and Sandra also perform a suite of System tests including GPU Compute, and again GM206 is competitive.
We will not be getting into sub-scores in this review to see why and where certain apps perform as they do, instead focusing on overall scores for comparison purposes between these and other cards.
3D Benchmarks
Things get a bit more interesting when we look at 'traditional' 3D benchmarks rather than a bit of GPU compute accelerated code.
The OpenGL FurMark test shows the GTX 960 faster than the GTX 760 at both 720p and 1080p, but still slightly trailing the 'bigger' 270X.
We include legacy versions of 3DMark (03,06) to show performance of older games and DirectX 9 in a comparable way. Remember that GM206 is 'memory bandwidth starved' compared to the two other cards, but in certain scenes compression can significantly increase memory bandwidth and throughput.
We see good results for our legacy 3DMark tests, and this trend continues all the way through to the DX11 family tests especially for shader heavy Firestrike, showing the claimed performance improvement in the low level CUDA cores.
This standard set of 3D Benchmarks provides a good baseline comparison for other hardware.
FurMark is, according to the vendors, a 'power virus', as it will max out the hardware resources of a GPU to 100% and stress it, whereas a typical real-time game is a dynamic load and either will not load the GPU to the maximum or the load will vary. Some commentators hate the term 'power virus' because FurMark is a legitimate software benchmark, but they are wrong: very few applications produce 100% GPU and memory controller load. Additionally the load will cause modern GPUs to throttle hard or turn off GPU Boost and run at their base clock, as a CPU does. In games, unless a thermal or power limit is tripped, we typically always have some boost.
Driving and Racing
We use Dirt 3 as a quick and meaningful benchmark quite a bit at NitroWare, for several reasons:
- Reasonable DX 11 effects that won't bring a system to its knees
- Benchmark mode is realistic of actual game play and includes randomised AI
- Scales well on different CPU/GPUs and supports up to 8 CPU threads
- Commonly owned and tested, quick to run
We have since deprecated Dirt 2 (we used it as it was based on an older engine and showed slightly different trends), but the 960 combined with the newer NVIDIA drivers shows a one-third improvement in performance at 1200p Ultra with 4x MSAA, which is substantial.
Trackmania is a popular free to play online racer using DirectX 9, but the age and nature of the game mean it can't take advantage of the newest hardware and shows modest FPS even on systems such as a 5960X with a GTX 980 or 290X. Nevertheless, GM206 again shows a gain here.
Now you may say, ah-hah! you only tested one resolution on Dirt3, how does it perform at lower resolutions, details and AA compared to the other cards? Well we have you covered here. For specific games we perform a top-to-bottom scaling test covering different resolutions, details and AA levels.
GTX 760 and R9 270X were very similarly matched at original time of testing, 135 FPS MIN for low resolution (likely CPU bound) and 75 v 81 FPS at Ultra with 8x AA.
In comparison at the low end 960 increases this by 10 fps and the top end by a whopping 10 to 15 FPS combined with memory intensive 8x AA, so much for being starved of bandwidth.
Overall, GTX 960 shows a 10-20 FPS increase over both GTX 760 and R9 270X.
Flight and Sandbox
Just Cause 2 and Warthunder are NVIDIA logo games while Sleeping Dogs is AMD.
Just Cause 2 uses compute shaders and DX10 so will show some trends on GPUs with different GPU compute performance characteristics.
Sleeping Dogs is video memory and horsepower dependent, including the optional high res texture pack. We do not use extreme AA as that enables Super Sample Anti Aliasing.
Warthunder is a popular free to play combat sim that includes heavy use of DX 11 effects for ground, air and water realism. A problem with this test is the developer is constantly updating the game automatically which can break backward comparisons.
Given a large portion of the Warthunder tests are repetitive textures/shaders such as water, grass and metal ship hulls and aircraft skins, it is interesting to consider how much memory compression and caching have an effect here.
Visit our youtube channel for several 60 FPS 4K captured clips of Warthunder at its maximum detail levels. http://www.youtube.com/nitrowaredotnet
First Person
This compilation of first person shooter benchmarks includes an older benchmark set, some of which we have also retired. It is presented for backwards comparison purposes.
Note that we test performance GPUs at very high or ultra settings, therefore the frame rates mentioned may be worst case examples. Regardless, comparing the varying sample of games (and game engines) we have here, the 960 gives great performance for US $199, and yes it does run Crysis...
Bioshock Infinite, a DX11 title based on Unreal Engine, includes a repeatable benchmark mode that flies around several maps, including some from the single player campaign. Unfortunately this benchmark does not include any AI and is mainly graphics focused.
We want to focus on min and average frame rates here; the max frame rates in this particular test are quite inconsistent and unrealistic, based on peculiarities in how the benchmark changes scenes.
The min FPS are significant, and give a 'playable' experience on their own compared to the other cards. Like Dirt 3, we see about a 20 FPS increase in average frame rates, which is great. Pity about the lack of AI in the benchmark though. Note that Bioshock was an AMD Gaming Evolved title.
Battlefield 3 Smoothness
Crysis 3 Smoothness
Third Person
Batman Arkham City is not only a popular game with users but popular with testers thanks to its PhysX support and benchmark mode. As we explained earlier, this being an NVIDIA endorsed game, GPU accelerated PhysX throws a spanner into the works when we compare with other graphics cards.
It is very important to consider GPU v CPU PhysX when comparing NVIDIA endorsed games. With the default NVIDIA driver setting of 'auto', and given the right game, enhanced PhysX will be performed on the GPU. Enthusiasts who have played any of the three Batman games or Borderlands may be familiar with the visual enhancements GPU PhysX provides; to me, not having the added NVIDIA exclusive features makes the game look bland and ugly. If the GPU supports extra visuals for little or no penalty, why not use them?
We want above 60 FPS at Very High and Extreme details and GTX 960 gives us this, with Normal PhysX enabled. We did not test High PhysX for comparative reasons. The AMD Radeon lost 30 FPS alone when we enabled GPU PhysX, and would lose more if we upped this; the code is simply not optimised for high performance on non-NVIDIA hardware.
The Lost Planet 2 benchmark has been a stalwart of DX11 and tessellation testing. We may be hitting a threading or vendor optimisation issue with this one, based on the numbers.
Resident Evil 5 is an older DX10 test which does have a lot of enemy AI despite being path scripted. Over the years it has given some weird results on hardware where it shouldn't, but given the frame rates all three compared cards achieve, performance clearly is no issue here.
Arcade/Strategy
Our Street Fighter 4 benchmark is set up to render as fast as possible. Speed is important for a fighter, however they are locked at 60 Hz. Unlocking the rate allows us to look at throughput. The title is also useful for testing whether certain weaker hardware can sustain 60 Hz without stutter, glitches or drops.
SHOGUN 2 from Sega is an interesting title, not just for the large scope the game encompasses for its age but the included benchmarks which focus on CPU and Graphics independently or combined. The game can generate and handle a large number of enemy AI so the system must be able to support this.
Being DX11 based, it can take full advantage of GM206 and Maxwell, again 20 FPS faster. Note the CPU score is within margin so there's no 'cheating' here.
We like to use World in Conflict as an RTS test due to its complex DX10 graphics (explosions, clouds, destruction, lighting) and because it benchmarks an actual mission from the single player campaign, complete with enemy AI. It is a little dated but the graphics and system load still hold up 5 years later.
GPU Compute
Despite the 750ti being on the market for a year and the GTX 980 for 4 months, the GPUs are still new and software support is often lacking. Maxwell GM200 chips support CUDA compute capability 5.2 so some developers may have to update their apps.
While Computemark, LuxMark and Folding at home showed large speed-ups, the ubiquitous 'DirectCompute and OpenCL benchmark' gave us poor performance on DirectCompute, and OpenCL was broken. This is an application support issue.
We are testing using the first driver for GM206 and things WILL change.
Coin Mining
Upon release the Maxwell based GTX 750ti was a hit with coin miners due to its greatly improved compute performance per watt compared to older GEFORCE cards, and for a period it was the card of choice for coin miners. We present some coin mining benchmarks, but this is not meant to be an endorsement that a GTX 960 is ideal for mining or should be used for coin mining; it is simply a representation of launch performance.
At time of launch two tools support GTX 960: CUDAminer from 2014 works with Maxwell, and ccminer release 31 from 2015 natively supports GTX 960. ccminer is an evolution of CUDAminer, supporting more ciphers. We last tried CUDAminer on the GTX 980 and got about 560 mhash/s throughput, with the GTX 960 now achieving 330, above the ballpark for its specs.
This is the first review in which we have used ccminer. It defaults to 'x11' GPU mining and achieves 5020 kH/s in benchmark mode on the EVGA GTX 960 SSC.
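To put the 'above the ballpark' remark into numbers, the short sketch below compares the two CUDAminer figures quoted above against the 'half a GTX 980' rule of thumb used earlier in this review. The 0.5 expectation is our own simplifying assumption, not a vendor figure, and the throughput units are taken as quoted.

```python
# CUDAminer throughput as quoted in the text (units as written in the review).
GTX_980_RATE = 560.0
GTX_960_RATE = 330.0

# Assumption: GTX 960 is roughly 'half a GTX 980', so we expect about half the rate.
EXPECTED_FRACTION = 0.5

# Fraction of the GTX 980 result that the GTX 960 actually delivers.
observed_fraction = GTX_960_RATE / GTX_980_RATE

# How far above (or below) the half-a-980 expectation that lands, in percent.
margin_pct = 100.0 * (observed_fraction / EXPECTED_FRACTION - 1.0)

print(f"GTX 960 delivers {observed_fraction:.0%} of the GTX 980 rate")
print(f"That is {margin_pct:+.1f}% relative to the half-a-980 expectation")
```

On these figures the GTX 960 lands at roughly 59% of the GTX 980 result, which is where the 'above the ballpark' comment comes from. The two runs were made at different times, so treat the margin as indicative only.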
Video Encoding
We do not yet have the GPU enabled software to test the hardware h265 encode/decode features of Maxwell, but the updated NVENC performance for H264 in the 2nd gen GM200 based Maxwell chips does work when paired with NVIDIA Shadowplay, which can capture 4K at 60 FPS. We demonstrated this with GTX 980, but that is more of a 'real time' encoding than the accelerated encoding which some users are interested in.
We did try our usual accelerated tests, however, which showed some improvement.
Windows Movie Maker WILL encode and render faster with bigger, faster GPUs and will perform poorly, sometimes slower than CPU only, on certain integrated GPU systems. The hardware utilisation can be viewed using tools like GPU-Z.
WMM is old and pre-dates the newer video engines in AMD and NVIDIA GPUs which need specialist API support to access, however it will 'try' to use whatever is available providing the graphics driver supplies the right CODECs and API access.
How do we know it isn't just leveraging the CPU in some way? Swapping GPUs on the same CPU shows us the difference, in addition to GPU utilisation.
For example, from our 5960X Haswell-E review: in WMM, a 5960X paired with a GTX 460, which does not have NVIDIA's new NVENC video processor, encodes our test workflow in 23 seconds. Swapping in a GTX 980 with the same graphics driver speeds this up to 8.5 seconds.
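As a quick sanity check on the WMM example, the sketch below turns the two quoted encode times into a speed-up factor and a time saving. Only the 23 s and 8.5 s figures come from our testing; the function and variable names are ours, for illustration.

```python
# Encode times quoted in the text: same 5960X, same driver, different GPU.
GTX_460_SECONDS = 23.0  # no NVENC video processor
GTX_980_SECONDS = 8.5   # Maxwell with NVENC


def speedup(baseline_s: float, new_s: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return baseline_s / new_s


def time_saved_pct(baseline_s: float, new_s: float) -> float:
    """Share of the baseline run time that the new run saves, in percent."""
    return 100.0 * (baseline_s - new_s) / baseline_s


wmm_speedup = speedup(GTX_460_SECONDS, GTX_980_SECONDS)
wmm_saved = time_saved_pct(GTX_460_SECONDS, GTX_980_SECONDS)

print(f"GTX 980 encodes {wmm_speedup:.2f}x faster, saving {wmm_saved:.0f}% of the time")
```

With the CPU held constant, a speed-up of around 2.7x is hard to attribute to anything other than the GPU swap.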
Due to a bug introduced into MediaEspresso which breaks NVIDIA support, we could not fully test NVIDIA acceleration. However, note the scores for GPU Decode: GTX 980 is as fast as Intel QuickSync when told to decode on the GPU as quickly as possible, and it is 2-3x as fast as the CPU doing both the encode and decode. CPU only tests are included for reference.
Some may ask: this is a GPU review, so what is Handbrake doing here when it is a CPU test? They would be wrong. Both Intel and AMD have dedicated themselves to improving Handbrake heavily, resulting in the version we have today which includes DXVA (GPU acceleration) as well as OpenCL and QuickSync support.
Handbrake includes a fine tuned Intel-Dedicated h264 CODEC so other solutions may never be as fast. GTX 960 improves on OpenCL compared to Radeon. Handbrake does not support CUDA.
Professional Graphics
The Cinema4D graphics engine in Cinebench is used for professional rendering, so we do include results here. They do not show an improvement, but at least there is no decrease, especially considering core count and TDP. The behaviour with OpenGL in R11.5 is typical of what we see: under-performing due to the software.
Blender 2.73a claims support for 900 series GeForce Cards but clearly the software isn't fully optimised for Maxwell yet. We had similar issues with GTX 980 but at least there, with a slightly older Blender, CUDA wasn't slower than CPU but on 2.73 it is slower.
The thing here is that Blender advertises frequent optimisations to CPU render times in newer builds. We see this with CPU render (40s faster) and are forced to use these newer builds to support our new GPUs, but the support isn't right yet.
OpenCL support is not functional, but we can see GPU compute acceleration working by comparing against a run of RatGPU, which shows an increase.
In the real world, some users do use professional workstation apps with consumer grade GPUs, and students can't afford a Quadro or FireGL, let alone a workstation. We and a small number of other review sites still carry on with SPEC's workstation benchmarks. A number of ISVs have recognised this and opened up their software to consumer GPUs.
GTX960 runs Viewperf 11 great compared to the higher powered and more expensive GTX 760 Hawk. Architectural and driver differences are why AMD is faster here.
Power
'Power and Efficiency' is what NVIDIA have been hailing from the hills since the release of the 750Ti and 1st gen Maxwell.
The big question is: does GM206 draw less power than previous GPUs? Yes, but we would expect it to, as most GTX 960s have only a single power connector and vary from 120 to 150W TDP. Note that our Firestrike performance was faster than the others and the card used less power getting there.
Manually setting the fan to 100% speed uses an extra 5 watts of power. Users should never see 100% in any normal use and we don't recommend it be used anyway.
Maxwell, like Kepler, features more power state levels so that the GPU can clock as low as possible during desktop idle or light load as well as shut down unnecessary gates. AMD have typically used a fixed 300 MHz speed for their idle clock.
Overclocking
We can overclock the EVGA 960 two ways: EVGA KBOOST, and traditional overclocking using PrecisionX or the tool of your choice. KBOOST is something we have seen for the first time in EVGA Precision X 16 (which is made by EVGA and not RivaTuner) and which was not present in MSI Afterburner.
EVGA KBOOST mode forces GPU BOOST 2.0 to run at a high speed full-time rather than throttle depending on dynamic load. We have not seen this mode before on competitor brands and the caveat here is simply heat and power as well as a slightly lower peak clock speed.
We monitored 59C in Bioshock and 74C in Dirt 3, with a slight throttle to 1418 MHz because of the temperature. Gaming loads (versus stress tests) typically don't end up overloading the GPU and disabling boost, so in quick testing we saw little to no difference from forcing GPU BOOST on. It may have limited benefit, primarily for benchmarking as described by EVGA.
EVGA KBOOST PERFORMANCE - GTX 960 SSC
Test | Without KBOOST | With KBOOST |
---|---|---|
Bioshock | Min 28FPS AVG 72 FPS | Min 27FPS AVG 77 |
Dirt3 | Min 103 AVG 128 | Min 105 AVG 127 |
3DMark Firestrike | 6862 | 6798 |
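To make the table easier to read at a glance, the following sketch expresses each KBOOST result as a percentage change over the run without KBOOST. The numbers are copied from the table above; the helper name `pct_change` is ours and purely illustrative.

```python
# Scores copied from the KBOOST table above: (without KBOOST, with KBOOST).
kboost_results = {
    "Bioshock min FPS": (28, 27),
    "Bioshock avg FPS": (72, 77),
    "Dirt3 min FPS": (103, 105),
    "Dirt3 avg FPS": (128, 127),
    "3DMark Firestrike": (6862, 6798),
}


def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Percentage change per test when KBOOST is switched on.
kboost_delta = {
    test: pct_change(without, with_kboost)
    for test, (without, with_kboost) in kboost_results.items()
}

for test, delta in kboost_delta.items():
    print(f"{test:<18} {delta:+.1f}%")
```

Apart from the Bioshock average, every result moves by less than 4% in either direction, and two of the five get slightly worse, which is consistent with the 'little to no difference' we describe. We only ran limited passes, so small deltas here may simply be run-to-run variation.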
Traditional overclocking is of course supported and we have up to +100 mV of overvoltage. Since the EVGA SSC model is already factory overclocked, we could only achieve a 50 MHz core and 50 MHz memory overclock in very limited testing. Furmark would freeze above 50 MHz even with the full +100 mV overvolt.
The 50 MHz OC gave us a 1480 MHz boost speed, which dropped to 1468 MHz during the Bioshock Infinite benchmark, at 72C. Dirt 3 also ran at 1468 MHz and 72C.
EVGA OVERCLOCK PERFORMANCE - GTX 960 SSC with 50MHz overclock, 110% Power limit, 94c temp limit and +12mv
Test | Result |
---|---|
Bioshock | Min 23.26 AVG 78.15 |
Dirt3 | Min 112.5 AVG 132.6 |
3DMark Firestrike | 6967 |
Furmark | 2623, 44FPS,83c |
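For a rough idea of what the 50 MHz overclock buys, the sketch below compares the overclocked results with the 'Without KBOOST' column of the KBOOST table. Treating that column as the stock baseline is our assumption, and `uplift_pct` is an illustrative helper rather than part of any tool we used.

```python
# Assumed stock baseline: the 'Without KBOOST' column of the KBOOST table.
stock = {
    "Bioshock avg FPS": 72.0,
    "Dirt3 avg FPS": 128.0,
    "3DMark Firestrike": 6862.0,
}

# Results from the overclock table (50 MHz overclock, 110% power limit).
overclocked = {
    "Bioshock avg FPS": 78.15,
    "Dirt3 avg FPS": 132.6,
    "3DMark Firestrike": 6967.0,
}


def uplift_pct(stock_scores: dict, oc_scores: dict) -> dict:
    """Percentage gain of each overclocked score over its stock score."""
    return {
        test: 100.0 * (oc_scores[test] - stock_scores[test]) / stock_scores[test]
        for test in stock_scores
    }


oc_uplift = uplift_pct(stock, overclocked)

for test, gain in oc_uplift.items():
    print(f"{test:<18} {gain:+.1f}%")
```

On these numbers the gain ranges from about 1.5% in Firestrike to about 8.5% in the Bioshock average, so the overclock helps but does not transform the card.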
Pricing, Availability and Support
The official pricing for GTX 960 is US $199, with the RRP for the EVGA SSC being US $209. For comparison, the GTX 760 launched at US $249 and the GTX 660 at US $229. The base price and the overclock markup are fine, but NVIDIA does not issue regional RRPs and at the time of writing some exchange rates are low, meaning more expensive cards in some regions.
EVGA GTX 960 SSC Availability
Region | Local Price | Converted Price in USD |
---|---|---|
AUS - staticice.com.au | AUD $ 319 | $ 248 |
US - newegg.com | USD $ 210 | $ 210 |
EU - geizhals.at | EUR € 215 | $ 243 |
At the time of going to press, the cheapest GTX 960 in Australia is priced at AUD $269 inc tax according to staticice.com.au, which equates to USD $208.
Radeon R9 285 Availability
Region | Local Price | Converted Price in USD |
---|---|---|
AUS - staticice.com.au | AUD $ 307 | $ 239 |
US - newegg.com | USD $ 210 | $ 210 |
EU - geizhals.at | EUR € 192 | $ 217 |
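The regional price gap in the two tables can be worked out as below. This sketch derives the exchange rate implied by each converted price and the premium over the US listing. All figures are from the tables above, and the variable names are ours.

```python
US_PRICE_USD = 210  # newegg.com price listed for both cards

# (card, region, local currency, local price, converted USD price) from the tables.
listings = [
    ("EVGA GTX 960 SSC", "AUS", "AUD", 319, 248),
    ("EVGA GTX 960 SSC", "EU", "EUR", 215, 243),
    ("Radeon R9 285", "AUS", "AUD", 307, 239),
    ("Radeon R9 285", "EU", "EUR", 192, 217),
]

implied_rate = {}  # USD per one unit of local currency
premium_pct = {}   # extra cost over the US listing, in percent

for card, region, currency, local_price, usd_price in listings:
    key = (card, region)
    implied_rate[key] = usd_price / local_price
    premium_pct[key] = 100.0 * (usd_price - US_PRICE_USD) / US_PRICE_USD
    print(
        f"{card:<17} {region:<3} 1 {currency} = {implied_rate[key]:.3f} USD, "
        f"premium over US: {premium_pct[key]:+.1f}%"
    )
```

The premium figure does not separate local sales tax from exchange rate effects, so it should be read as the difference a buyer sees at the checkout rather than a pure currency comparison.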
EVGA have company based extended on-line support and warranty however this program has not been extended to Australia according to EVGA.
The support information on the sample we received referred Australians to contact EVGA Taiwan and to not return the graphics card to point of purchase. This is incorrect, and Australians should return the product to point of purchase first for exchange, repair or refund.
Verdict
Cards like the 6600 GT, GTX 460, 660 and now 960 are aimed at the budget conscious enthusiast/gamer who is willing to spend 200-300 on a new graphics card. Not everyone can afford, wants or needs a US $550 flagship card in whatever guise it turns out to be. Things get more complicated when we consider exchange rates, local mark ups, taxes and supply and demand. In Australia, flagship GeForces have frequently neared or exceeded AU $1000 and currently GTX 980 is no less than $650.
Admittedly, I am in the target demographic for the GTX 960 and have purchased the 6600 GT and GTX 460 for my own PC in the past as these met my budget. US $200 is a good price for what is almost half a GTX 980. I have had a GTX 460 and a Radeon HD 7850 (AMD's sweet spot card from 2012) in my personal machine, and seeing the performance the GTX 960 achieved here I would be happy had I upgraded from the 460 to the 960 for the money.
While GTX 960 is interesting and efficient for the power, the value proposition for performance per dollar may change very soon. AMD has been known to offer aggressive price cuts or value adds such as their game program, and while NVIDIA also has a game program, both typically expire or are introduced depending on pricing for particular new cards.
There are also potential future announcements from both vendors which may sway some purchasers: 'if I only pay xx dollars more I can have a better GPU'.
NVIDIA have again delivered on their constant execution of timely, fast and efficient products. The NVIDIA silicon itself, the software and the third party integration by EVGA are all very good. The card worked flawlessly out of the box using WHQL launch drivers, and EVGA offer not only a very high clock speed for their OC models but also some great software which allows the card to be pushed further, up to its thermal and power safety limits.
If you have a Fermi based 450 to 560 or even older and want a mid range upgrade, now is the time and GTX 960 is the card. Our testing proved NVIDIA's claims of performance and power efficiency over Kepler based GeForce architecture.
Although similarly priced to AMD's R9 285, GTX 960 offers more and newer hardware features such as HDMI 2.0, three DisplayPort outputs, G-SYNC, MFAA, VXGI and 0 RPM fans.
There has been some controversy and hate over the GTX 970 lately in on-line communities. At the end of the day we can explain the pros and cons of both brands; you, the enthusiast, should take this on board and decide which brand suits your needs. I will repeat what I wrote on Twitter this week.
https://twitter.com/NitroWare/status/561041542677139457
https://twitter.com/NitroWare/status/561042358691581952
EVGA GTX960 SSC Edition Pros
- ACX ball bearing cooler with silent fan below 65c
- High factory overclock
- EVGA support
- EVGA PrecisionX Software
- New hardware features introduced with Maxwell GPU
EVGA GTX960 SSC Edition Cons
- Styling somewhat bland, may not appeal to some enthusiasts
- APAC regional support needs some improvement
- Slightly higher power draw, heat and noise than lower clocked models
- Not an upgrade path for GTX 760 owners.