Comparing DDR4 IC on Z170 platform – part 3 – 1.55 volts, safe and brave :)

This is the third post in the DDR4 on Z170 series; for the previous parts see part 1 and part 2, where I cover stock and safe settings, respectively. In this part I’m going to use settings which are beyond everyone’s safe range; however, in my opinion this is just about the maximum I could use 24/7 in my system with adequate cooling (at least a 120mm fan blowing directly over the RAM). This is quite brave, though still not extreme, as we’ll see in the next parts. The settings from the previous part have been optimized further, as summarized in the table below:

Changes

Let me quickly summarize what changes in the memory settings:

Memory | Original frequency | Original timings | Voltage | Optimized frequency | Optimized timings | New voltage | OC frequency | OC timings | OC voltage
G.Skill TridentZ 3600C17 | 1800 MHz (3600 effective) | 17-18-18-38 2T | 1.35V | 1800 MHz (3600 effective) | 15-15-15-28 1T | 1.4V | 1800 MHz (3600 effective) | 13-13-13-28 1T | 1.55V
G.Skill RipjawsV 3200C16 | 1600 MHz (3200 effective) | 16-16-16-36 2T | 1.35V | 1733 MHz (3466 effective) | 15-17-17-28 1T | 1.4V | 1800 MHz (3600 effective) | 13-17-17-28 1T | 1.55V
HyperX Savage 2666C13 | 1333 MHz (2666 effective) | 14-15-15-35 2T | 1.35V | 1400 MHz (2800 effective) | 12-14-14-28 1T | 1.4V | 1500 MHz (3000 effective) | 13-15-15-28 1T | 1.5V
HyperX Savage 2666C13 (second setting) | | | | 1466 MHz (2933 effective) | 13-14-14-28 1T | 1.45V | 1533 MHz (3066 effective) | 12-15-15-28 1T | 1.55V
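
To put the CL changes in the table into absolute terms, here is a minimal sketch that converts a real memory clock and a CAS latency into nanoseconds. It assumes the usual relation latency = CL / clock; the function name and the selection of rows are mine, only the clocks and CL values come from the table.

```python
def cas_latency_ns(clock_mhz: float, cl: int) -> float:
    """CAS latency in nanoseconds for a real (not effective) clock in MHz."""
    # One clock cycle lasts 1000 / clock_mhz nanoseconds; CL counts cycles.
    return cl / clock_mhz * 1000.0


# (label, real clock in MHz, CL) taken from the table above
settings = [
    ("TridentZ stock", 1800, 17),
    ("TridentZ optimized", 1800, 15),
    ("TridentZ OC", 1800, 13),
    ("RipjawsV stock", 1600, 16),
    ("RipjawsV OC", 1800, 13),
]

for label, clock, cl in settings:
    effective = clock * 2  # DDR transfers data twice per clock
    print(f"{label}: {effective} effective, CL{cl} = {cas_latency_ns(clock, cl):.2f} ns")
```

This covers only the first timing; the other primary and secondary timings contribute to the measured latency as well, so the AIDA numbers will not follow this figure one to one.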

3600c13_155

In addition to the main timings and frequency changes, I also tightened the secondary and tertiary timings, so they are quite aggressive. This includes tCWL changed to 9 and, in the case of B-die, TRFC to 260.

Test results are:

XTU

xtu_155

The progress continues for the Samsung-based memory: both kits gain 15 points by moving from CL15 to CL13. Hynix breaks the 1400 barrier, with the less aggressive setting closing in on the tighter one.

Geekbench

g3_155

General results in Geekbench 3 are the same, with Samsung ahead of Hynix; however, the 3200 kit caught up with the 3600 kit despite the B-die’s advantage of tighter TRCD/TRP. It is also possible that I experienced bad RTL training on B-die while testing this. In the Memory subtests HyperX is now far behind due to its too low operating frequency, which is also reflected in the overall score.

AIDA

aida_l_155

The latency test confirms the theoretical improvement: both G.Skill kits improved by 2 ns since the previous test, while both HyperX settings gained “only” 1 ns.

aida_rw_155

Synthetic bandwidth tests also improve. B-die is almost maxed out so it doesn’t gain much, E-die almost catches it, while Hynix shows a significant boost, especially in the first (less aggressive) setting: over 4 GB/s in each test.

SuperPi 32M

32m_155

The closing test is SuperPi 32M, where both Samsung kits gain an additional 3 seconds. Results for both HyperX settings are very similar and they don’t differ much from the previous results, which might suggest that we’re reaching the performance limit of Hynix MFR on air.

Voltage scaling

Starting from this part, I will show collective results for XTU and 32M as real life benchmarks on one chart to visualize voltage scaling. I went the easy way with charts here by cutting out the part which is not yet described.

XTU

xtu_scaling_155_v2

32M

32m_scaling_155_v2

This concludes the tests with safe voltage. I was able to demonstrate possible gains in benchmarks, which were visible, but not as impressive as when switching from XMP to optimized settings. The next part will include another voltage raise, to 1.7V, which is definitely beyond safe settings; remember that 1.65V was officially the maximum recommended voltage for the previous generation of memory (DDR3). It will also be the last part where HyperX is included, as the maximum of MFR on air will be reached.

Comparing DDR4 IC on Z170 platform – part 2 – 24/7 overclocking setting

This post is the second part in the DDR4 on Z170 series. In the previous part I covered the basic setup and the baseline for tests. Here I will begin overclocking by raising memory voltage to 1.4-1.45 volts, which seems quite safe for 24/7 usage. In addition, the timing settings taken from the XMP profile will be optimized and tightened.

Changes

Let me quickly summarize what changes in the memory settings:

Memory | Original frequency | Original timings | Voltage | Optimized frequency | Optimized timings | New voltage
G.Skill TridentZ 3600C17 | 1800 MHz (3600 effective) | 17-18-18-38 2T | 1.35V | 1800 MHz (3600 effective) | 15-15-15-28 1T | 1.4V
G.Skill RipjawsV 3200C16 | 1600 MHz (3200 effective) | 16-16-16-36 2T | 1.35V | 1733 MHz (3466 effective) | 15-17-17-28 1T | 1.4V
HyperX Savage 2666C13 | 1333 MHz (2666 effective) | 14-15-15-35 2T | 1.35V | 1400 MHz (2800 effective) | 12-14-14-28 1T | 1.4V
HyperX Savage 2666C13 (second setting) | | | | 1466 MHz (2933 effective) | 13-14-14-28 1T | 1.45V

In addition to the main timing changes listed in the table above, subtimings were also optimized, including tightening of TCWL to 13 cycles, TRFC to 280 cycles and TRTP to 8 cycles. These settings are far more aggressive than default, although they are still not the tightest settings possible. Here’s the screenshot from the OS with TridentZ installed (most of the subtimings are identical for other sticks):
3600c15-timings

On to the tests:

XTU

xtu_safe

Compared to previous results, we can observe dramatic result improvement in all cases. XTU is very sensitive to memory setting and optimizations made here cause ~8% increase in achieved results. Judging from experience, this is a difference of around 200-300 MHz core clock. What’s also interesting is that RipjawsV kit using older E-die is almost able to match its bigger brother and optimized HyperX will be better than much faster G.Skills on stock (XMP) settings.

Geekbench

geekbench_safe

In Geekbench results are similar – overclocked HyperX will catch stock G.Skills, but Samsung-based RAM will run away when optimized. Overall result is 200-300 points better than on stock.

AIDA

aida_latency_safe

Starting with latency, we can see that the results are improving a lot. XMP settings on G.Skills gave latency measurements of around 43.5 ns and B-die based TridentZ gain 5 ns here, which is more than 10%. E-die also allows for improvement, but it’s substantially smaller, 2.5 ns and around 5%. Optimized HyperX again catch stock G.Skill.
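
The percentages quoted for the latency gains can be checked with a few lines. This is only a sketch of the arithmetic, using the 43.5 ns XMP baseline and the 5 ns and 2.5 ns gains from the paragraph above; the function name is made up for illustration.

```python
def percent_gain(baseline_ns: float, gain_ns: float) -> float:
    """Latency reduction expressed as a percentage of the baseline."""
    return gain_ns / baseline_ns * 100.0


XMP_LATENCY_NS = 43.5  # approximate XMP latency of both G.Skill kits

for kit, gain_ns in [("B-die TridentZ", 5.0), ("E-die RipjawsV", 2.5)]:
    print(f"{kit}: {gain_ns} ns faster = {percent_gain(XMP_LATENCY_NS, gain_ns):.1f}%")
```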

aida_read_write_copy_safe

The trend continues here as well, and we can see increases of 3-4 GB/s in memory bandwidth of G.Skill mems. This also shows how loose the default timings are, but we cannot expect that it will directly translate into 10% performance improvement. Unfortunately, operating frequency of HyperX mems is too low to let it spread its wings.

SuperPi 32M

superpi32m_safe

SuperPi 32M shows some real performance improvement. B-die got down to 6:29 from 6:34, E-die to 6:31 from 6:37 and MFR to 6:33 from 6:40. Every second counts when fighting for optimal performance, but what we see here is that the test runs about 2% shorter.
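
For anyone who wants the exact percentages behind the "about 2%" figure, here is a small sketch that converts the m:ss times into seconds and computes the relative improvement. It reads the E-die result as 6:37 before and 6:31 after; the helper names are hypothetical.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'm:ss' SuperPi time into whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def improvement_percent(before: str, after: str) -> float:
    """How much shorter the run got, as a percentage of the original time."""
    b, a = to_seconds(before), to_seconds(after)
    return (b - a) / b * 100.0


# (stock time, optimized time) as quoted in the text
results = {
    "B-die": ("6:34", "6:29"),
    "E-die": ("6:37", "6:31"),
    "MFR": ("6:40", "6:33"),
}

for ic, (before, after) in results.items():
    print(f"{ic}: {before} -> {after}, {improvement_percent(before, after):.2f}% shorter")
```

With these inputs the gains come out between roughly 1.3% and 1.8%, so "about 2%" is a rounded-up figure.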

This concludes the first round of tests, where a safe voltage of 1.4-1.45V is used. I was able to demonstrate how loose XMP timings are and what the possible gains in benchmarks are. The real overclocking will start in the next part, when I will raise the voltage even more to 1.55V, which in my opinion is just about the maximum you could consider for 24/7 with proper cooling in your system.

Comparing DDR4 IC on Z170 platform – part 1 – intro and stock settings

Earlier this year I researched information for an article which will be published on overclock.pl soon. The article is going to cover a comparison of the performance of DDR4 modules in synthetic benchmarks and their overclockability. For that purpose, I will use the following benches in their default settings as per HWBOT rules:

  • SuperPi 32M
  • XTU – version 6.0.28
  • Geekbench – version 3.4.1 64-bit
  • AIDA – version 5.7.0

The test platform consists of air cooled setup of:

Component | Model | Settings
Motherboard | Asus Maximus VIII Hero |
CPU | Intel Core i5 6600K | 100 x 48 = 4800 MHz, 1.28 vcore, cache multi: 45
Cooling | CPU: Scythe Mugen 3B + Arctic 120 mm; RAM + MOBO: Enermax TwisterStorm |
PSU | Corsair RM850 |
SSD | Samsung 850 EVO 128 GB |

OS used is Windows 7 SP1 64-bit. All settings are constant apart from RAM. Presented results will be average of 5 runs.

Memory kits used in the test and DDR4 ICs on them:

  • G.Skill TridentZ 3600C17 (2×8 GB) – Samsung B-die
  • G.Skill RipjawsV 3200C16 (2×4 GB) – Samsung E-die
  • HyperX Savage 2666C13 (2×4 GB) – Hynix MFR

Results on default settings

The results here are achieved using XMP profile in RAM SPD – XMP profile is loaded in BIOS, system is booted and off we go 🙂

XTU

xtu_default

XTU has been the most used 2D benchmark on HWBOT in recent months, mostly due to the fact that it’s bundled with Intel motherboards. It’s interesting to see that HyperX wins here against the theoretically much faster RipjawsV. This leads to the conclusion that the G.Skill XMP profiles contain quite loose timings to allow booting at high speed on low voltage on most motherboards.

Geekbench

geekbench_default

HyperX loses a lot here due to its lower working frequency, which is clearly visible in the “Memory” tests. This confirms the hypothesis that frequency is very important for DDR4.

AIDA

Due to the fact that different scales are being used data is displayed in 2 charts:
aida_latency_default aida_read_write_copy_default

AIDA results confirm G.Skill advantage due to higher working frequency. Interesting thing is that latencies on both are similar, but B-die kit has higher operating frequency which translates into better bandwidth.
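
As a reference point for the bandwidth chart, the theoretical peak of each kit at its XMP speed can be estimated from the data rate alone. A minimal sketch, assuming a 64-bit (8-byte) bus per channel and dual-channel operation; neither is stated explicitly in the post, and the function name is mine.

```python
def peak_bandwidth_gbs(data_rate_mts: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB)."""
    # transfers per second * bytes per transfer * number of channels
    return data_rate_mts * bus_bytes * channels / 1000.0


# Effective (XMP) data rates of the three kits in MT/s
kits = {
    "G.Skill TridentZ 3600C17": 3600,
    "G.Skill RipjawsV 3200C16": 3200,
    "HyperX Savage 2666C13": 2666,
}

for name, rate in kits.items():
    print(f"{name}: {peak_bandwidth_gbs(rate):.1f} GB/s theoretical peak")
```

Measured AIDA results sit below these ceilings, and how close they get depends on timings, which is what the later parts try to improve.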

SuperPi 32M

superpi32m_default

SuperPi 32M results support previous observations – memory performance depends a lot on the frequency, and loose timings in G.Skill XMP profile make the chart a bit flat.

This concludes the test setup and baselining on stock settings. Next parts should be more interesting when we start overclocking and observe some patterns. I plan to divide it into several parts:

  • safe OC for 24/7 – with 1.45 volts
  • bench safe – with 1.55-1.7 volts range
  • full out – with 1.75 volts and above
  • summary to see scaling results

It was a lot of fun for me to prepare the results, but if you have some comments or questions, or just want to point out my poor efficiency, feel free to drop me a note.

DDR 400+ MHz

This post will be devoted to memory overclocking of DDR SDRAM. This memory type was the most popular between 2002 and 2005, replacing SDRAM and introducing double data rate to improve bandwidth. As per JEDEC standards, there are four official DDR speeds:

  1. 100 MHz (200 MHz data rate) – effective bandwidth 1600 MB/s
  2. 133 MHz (266 MHz data rate) – effective bandwidth 2133 MB/s
  3. 166 MHz (333 MHz data rate) – effective bandwidth 2666 MB/s
  4. 200 MHz (400 MHz data rate) – effective bandwidth 3200 MB/s

The effective bandwidth is calculated as speed * 8 bytes (64-bit bus) * 2 (double data rate). This number can also be found on memory markings as the PC rating, where it’s rounded to the closest hundred (for 133 it’s PC-2100 and for 166 it’s PC-2700). Of course you can find other speed ratings which have been defined by manufacturers, the maximum being 350 MHz (PC-5600), released by Patriot in limited numbers.
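
The formula and the PC rating rounding can be sketched as follows. The 133 and 166 MHz speeds are treated as 133⅓ and 166⅔ MHz, which is my assumption to reproduce the 2133 and 2666 MB/s figures from the list above; the function names are made up for illustration.

```python
from fractions import Fraction


def bandwidth_mbs(clock_mhz: Fraction) -> Fraction:
    """Effective bandwidth in MB/s: clock * 8 bytes * 2 transfers per clock."""
    return clock_mhz * 8 * 2


def pc_rating(clock_mhz: Fraction) -> int:
    """PC rating: bandwidth rounded to the closest hundred."""
    return round(bandwidth_mbs(clock_mhz) / 100) * 100


# Four JEDEC speeds plus the 350 MHz Patriot rating mentioned above
clocks = [Fraction(100), Fraction(400, 3), Fraction(500, 3), Fraction(200), Fraction(350)]

for clock in clocks:
    # int() truncates, matching how the list above writes 2133 and 2666
    print(f"{int(clock)} MHz: {int(bandwidth_mbs(clock))} MB/s, PC-{pc_rating(clock)}")
```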

I will focus here on reaching the maximum memory clock (a suicide run); no stability tests are going to be performed. As it turns out, achieving a 400 MHz memory clock (100% more than the highest official speed) is possible; however, several factors need to be considered:

  1. Motherboard
  2. CPU
  3. RAM
  4. BIOS settings
  5. Cooling
  6. OS and software.

At the moment of writing there were 18 official DDR memory clock results over 400 MHz in the HWBOT database (however, only one per person is displayed).
At the end of the post I am also sharing BIOS template that I use for memory binning, so I hope you can find it useful.

Motherboard

Proper motherboard choice is essential in every overclock. In this case I am going to use Socket 939 for Athlon 64 as Intel chipsets do not clock memory well enough. For that socket there is one ultimate motherboard to be used for overclocking – DFI LanParty NF4, preferred models are Ultra-D, SLI-D and SLI-DR. The Expert and Venus models could also be used, although they can exhibit compatibility issues with highly clocked RAM and might limit your OC.
Alternatively Socket 754 might be used, as it forces single channel memory, so the memory controller has less stress. Single channel is the way to go here.

CPU

Here the preferred CPU is a regular Athlon 64 with the Venice core, so that the 250 memory divider can be used. Opterons, as well as San Diego cores, have worse IMCs (integrated memory controllers) on average. Newcastle and Winchester cores don’t have the 250 divider, so in that case the CPU must be capable of high HTT in 1:1 with RAM.
Important – CPU must be tested for IMC capability as not all of them are capable of reaching 400 memory clock.
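
To see what the 250 divider buys compared to running 1:1, here is a rough sketch of how the memory clock on these CPUs is commonly derived. It relies on the widely cited rule for Athlon 64 that the memory runs at CPU clock divided by ceil(multiplier * 200 / memory setting); that rule is an assumption on my side and is not taken from this post, and the function name is hypothetical. Only integer multipliers are handled.

```python
def memory_clock_mhz(htt_mhz: float, cpu_multi: int, mem_setting_mhz: int) -> float:
    """Estimated real memory clock for a given HTT, CPU multiplier and BIOS memory setting.

    Assumption: divider = ceil(cpu_multi * 200 / mem_setting_mhz).
    """
    divider = -(-cpu_multi * 200 // mem_setting_mhz)  # integer ceiling division
    return htt_mhz * cpu_multi / divider


# 6 x 330 MHz as an example operating point
print(memory_clock_mhz(330, 6, 250))  # with the 250 setting
print(memory_clock_mhz(330, 6, 200))  # with the 200 setting (1:1 with HTT)
```

Under this assumption the 250 setting lets the memory run well above HTT, which is why a CPU without it needs a much higher HTT to reach the same memory clock.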

RAM

RAM must be selected carefully – it has to have Samsung TCCD or TCC5 memory chips, otherwise exceeding 300 MHz will be impossible without taking some extreme steps (voltage or cooling). You can identify the chips visually, if the RAM has no heatspreaders, or consult some of the RAM lists available online. Note that good PCB allows for better results, so look for non-reference PCB design (such as Brainpower).
My preferred models are A-Data Vitesta 566/600, G.Skill 4400/4800, Corsair 3200XL and Patriot 4800XBLK.
Note that you need high voltage supplied to the RAM to support the overclock. Safe values of voltage to be used 24/7 for TCCD/TCC5 chips are 2.7-2.9V, but in order to achieve highest clocks possible 3.3V will be used. Most of those chips stop scaling and start throwing a lot of errors after that mark, but there are some rare examples which can survive 3.5V and more.

BIOS settings

In terms of BIOS two things are important: BIOS version and settings. When it comes to the version, you can use the latest or select one of those with a memory table for TCCD.
The memory will be configured using loosest timing possible.

Cooling

For cooling you have to remember two things:

  1. Keep the CPU warm.
  2. Keep the memory cool.

You might be tempted at first to cool the CPU as well, but AMD CPUs on Socket 939 exhibit a cold bug which prevents them from operating above a certain HTT. This HTT limit is related to CPU temperature and it goes lower as the temperature drops, so in order to remove the limit it’s better to pump a safe voltage (such as 1.5-1.55V) and use air cooling.
For memory also some care is required, as Samsung TCCD have a cold bug, and most of them do not operate properly below 0 degrees Celsius. For most results it is sufficient to cool mems with a strong 120mm fan.

OS and software

Windows XP will be used, so if you have your favourite custom OS with most features removed, it will do great. You need two programs: CPU-Z for validating the settings and Clockgen for adjusting HTT settings (and you can always take a screenshot).

Let the fun part begin

The overclocking process will consist of two steps – first boot to OS, then slowly raise the clock and record the results.

End of part 1 🙂

BIOS template:

System
3000+
DFI NF4 SLI-D NF4LD329 bios
Matrox G550 PCI-E
120GB WD 1200JB
650W Silverstone SLI PSU

WinXP Pro SP3
NF4 6.66 chipset drivers

512MB Gskill PC4400LE

@6x330mhz boot at 3.3v

VID = 1.475
VID Special = Above Vid * 104%
chipset = 1.4v
LDT = 1.7v
LDT x2.5

CPC = Disabled
CAS Latency(CL) = 3
RAS to CAS(Trcd) = 7
Min RAS# Active time(Tras) = 15
Row Precharge Time(Trp) = 7
Row Cycle Time(Trc) = 22
Row Refresh Cycle Time(Trfc) = 24
Row to Row Delay(Trrd) = 7
Write Recovery Time(Twr) = 3
Write to Read Delay(Twtr) = 2
Read to Write Delay(Trwt) = 7
Refresh Period (Tref) = 3072
Odd Divisor Correct = Disabled
DRAM Bank Interleave = Disabled
Skew = Increase
Skew Value = 255
DRAM Drive Strength = Level 5 or 7
DRAM Data Drive Strength = Level 3
DDRAM Response Time (510 series bios setting) = FAST or Normal / (618 bios) = Normal, Fast, Fastest
Max Async Latency = 12ns
Read Preamble Time = 8ns
Idle Cycle Limit = 256
Dynamic Counter = Enabled
R/W Queue Bypass = 16x
Bypass Max = 7x
32 byte Granularity = Disable(8burst)