DrayTek UK Users' Community Forum

Help, Advice and Solutions from DrayTek Users

Intermittent Packet Loss to internal interface on New 2862ac

27 Apr 2020 12:00 #7 by admin
I'm not sure how to get speedtest.net to report packet loss and I don't know how the OP is measuring/detecting his packet loss

- Vigor 2862n here on BT VDSL. Firmware 3.9.2
- Tested when no-one else was using my Internet.
- Wired Ethernet
- Nearby host on the Internet used for pings, as well as the internal interface
- MTU 1500
- Good quality line (79/19 Mb/s, SNR 9/15dB, Atten 8dB)

I don't see any packet loss.

If your unit has been replaced, we could assume that our units are materially similar so any difference in results has to be down to environment - the ISP, the line, the measurement method, router settings etc. This is especially the case since the problem is intermittent - i.e. one of you said it was replaced, it was fine for a while and then not - the hardware/firmware didn't change - something else did (that doesn't mean it's not a router issue, as whatever changed may be the catalyst for a router problem). Also, if you've installed hundreds of these and this is the first one with a problem, that also points to an environmental trigger.

So, whilst this might not seem helpful, a pragmatic approach is to try and isolate the cause.

As you've said the problem is to the internal interface (e.g. 192.168.1.1, not the WAN side), try simplifying as much as possible: literally a factory-reset router (back up the config before you reset), NO internet setup and just PC to router with a direct Ethernet cable. Does it still occur?
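
A minimal Python sketch of how the LAN-side loss could be quantified during such a direct-cable test. It assumes English-language Windows ping output (for example from "ping -t 192.168.1.1" redirected to a text file); the function name summarise_ping and the file name ping_log.txt are hypothetical, and this is only one way of counting.

```python
import re

# Successful replies look like:
#   Reply from 192.168.1.1: bytes=32 time<1ms TTL=255
# "Reply from ...: Destination host unreachable." deliberately does not match.
REPLY_RE = re.compile(r"Reply from \S+ bytes=\d+ time[=<](\d+)ms")
TIMEOUT_TEXT = "Request timed out."


def summarise_ping(lines):
    """Count replies and timeouts in Windows ping output lines."""
    rtts = []
    lost = 0
    for line in lines:
        match = REPLY_RE.search(line)
        if match:
            rtts.append(int(match.group(1)))
        elif TIMEOUT_TEXT in line:
            lost += 1
    sent = len(rtts) + lost
    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "max_rtt_ms": max(rtts) if rtts else None,
    }


if __name__ == "__main__":
    # Hypothetical capture file; replace with wherever the ping output was saved.
    try:
        with open("ping_log.txt", encoding="utf-8", errors="replace") as log:
            print(summarise_ping(log))
    except FileNotFoundError:
        print("No ping_log.txt found; capture some ping output first.")
```

Only timeouts and normal replies are counted, so any other error lines are ignored rather than treated as loss.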

n.b. I wouldn't rely on packet loss reported by an ISP if you have DoS defences enabled - those will specifically (and deliberately) disrupt external poking. Even if I had found a problem, my Internet (from an application PoV) works fine.



Forum Administrator

27 Apr 2020 12:36 #8 by gracecourt

admin wrote:
I'm not sure how to get speedtest.net to report packet loss but I have a Vigor 2862n here (on BT VDSL). Firmware 3.9.2

I tested when no-one else was using my Internet. I don't see any packet loss.

I'm doing all tests on wired Ethernet and my ping host is as local as I can find, which excludes any problems further away or unrelated.

If your unit has been replaced, we could assume that our units are materially similar so any difference in results has to be down to environment - the ISP, the line, the measurement method, router settings etc. This is especially the case since the problem is intermittent - i.e. one of you said it was replaced, it was fine for a while and then not - the hardware/firmware didn't change - something else did (that doesn't mean it's not a router issue, as whatever changed may be the catalyst for a router problem).

So, whilst this might not seem helpful, a pragmatic approach is to try and isolate the cause.



All very fine and dandy, but there's ample evidence of the fault. Basic fault-finding confirms it - swapping the faulty router for a different one, using the same DSL lead and power supply, restores the FireBrick monitoring to the normal state (no packet loss or latency > 1s), but putting the faulty router back re-introduces the fault state (packet loss and latency > 1s). This latest router worked - perfectly normally - for less than two months. This is the third failed 2862ac I've had in less than two years, and it was the warranty replacement for a 2862ac that failed after a year. I've rejected this one under Section 9 Consumer Rights Act 2015 (not of "satisfactory quality"). I'm waiting for the retailer to confer with Bonus Limited ("SEG") to confirm that they'll issue a refund - I've had it with this model - but, if not, this is going to end up in the County Court.

admin wrote:
n.b. I wouldn't rely on packet loss reported by an ISP if you have DoS defences enabled - those will specifically (and deliberately) disrupt external poking. Even if I had found a problem, my Internet (from an application PoV) works fine.


A FireBrick monitoring appliance is a superb way of monitoring packet loss, and it won't activate any good-quality DoS defence: the polling - to check that the DSLAM is synchronised and the PPPoA session is still up - is only once per second. The ISP doesn't "report" it; a professional broadband connection gives the user full monitoring and control over the line. For example, the user can initiate Openreach line tests and change the ADSL or VDSL configuration on the DSLAM.

27 Apr 2020 14:27 #9 by admin

gracecourt wrote:
All very fine and dandy, but there's ample evidence of the fault. Basic fault-finding confirms it - swapping the faulty router for a different one
......This latest router worked - perfectly normally - for less than two months.



I think you misunderstood my point about a specific model and specific environments so it isn't "all fine and dandy" - it's directly relevant because your problem is not the normal experience which means there's likely something environmental causing the misbehaviour - we just don't know what yet. Also, it's unclear what testing period you're applying to each variation of hardware. e.g. when the problem occurs, is it continuous and for what period...does it stop if the unit is power cycled etc.

gracecourt wrote:
I've rejected this one under Section 9 Consumer Rights Act 2015 (not of "satisfactory quality").
.... if not, this is going to end up in the County Court.



If you've given up and don't want to listen to technical suggestions then you're wasting your/our time here. You're in the best position to diagnose the issue and eliminate factors. It is extremely unlikely that hardware just 'goes faulty' like that.

Also, your issue is different to that reported by the OP as he referred to pinging the LAN interface.



Forum Administrator

27 Apr 2020 15:04 #10 by hornbyp

SimonSaysBoo wrote:
I'm seeing a very similar (sounding) issue to this on my 2862Ln. Intermittent high packet loss (as reported by Speedtest.net) over 5% - 40%.



A year or two back, I got involved with a similar issue with Virgin Media's Hub 3 - which was producing Thinkbroadband graphs which were solid yellow. One of the common retorts at the time was that it was just ICMP, which was low priority and so didn't matter :roll: ...

I used the Windows NTP utility to demonstrate that it applied to UDP as well - even when all the traffic was confined to Virgin Media's network.

I used:
Code:
w32tm /stripchart /computer:NTPserverofyourchoice /dataonly /packetinfo /period:1 /samples:n

Which gives something like:
Code:
C:\>w32tm /stripchart /computer:ntp0.zen.co.uk /dataonly /packetinfo /period:1 /sampl
Tracking ntp0.zen.co.uk [212.23.8.6:123].
Collecting 1 samples.
The current time is 27/04/2020 14:51:58.
14:51:58, +00.3149889s
[NTP Packet]
Leap Indicator: 0(no warning)
Version Number: 1
Mode: 4 (Server)
Stratum: 2 (secondary reference - syncd by (S)NTP)
Poll Interval: 3 (8s)
Precision: -23 (119.209ns per tick)
Root Delay: 0x0000.0049 (+00.0011139s)
vvvvvvv SCROLL DOWN! vvvvvv
Root Dispersion: 0x0000.05DE (0.0229187s)
ReferenceId: 0xC342F102 (source IP: 195.66.241.2)
Reference Timestamp: 0xE25161626ED7DA8A (153153 13:49:22.4329812s - 27/04/2020 14:49:22)
Originate Timestamp: 0xE25161FEBADBC309 (153153 13:51:58.7299158s - 27/04/2020 14:51:58)
Receive Timestamp: 0xE25161FF0D664C0B (153153 13:51:59.0523422s - 27/04/2020 14:51:59)
Transmit Timestamp: 0xE25161FF0D6922E4 (153153 13:51:59.0523855s - 27/04/2020 14:51:59)
[non-NTP Packet]
Destination Timestamp: Roundtrip Delay: 14874900 (+00.0148749s) <<<**********************
Local Clock Offset: 314988900 (+00.3149889s)


I wrote a 'sensor' for PRTG and parsed out those "Roundtrip Delay: 14874900 (+00.0148749s)" lines, averaged over multiple samples. (There's a bug in Microsoft's W32tm utility - Roundtrip Delay: is presumably supposed to be on its own line :wink: )
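
The parsing step described above could be sketched in Python roughly as follows. This is not the original PRTG sensor, just a minimal illustration under stated assumptions: the w32tm output has been captured as text, only the seconds value in brackets is used, and the names roundtrip_delays and average_delay_ms are hypothetical.

```python
import re
from statistics import mean
from typing import List, Optional

# Matches e.g. "Roundtrip Delay: 14874900 (+00.0148749s)" wherever it sits
# on the line, since w32tm prints it after "Destination Timestamp:".
ROUNDTRIP_RE = re.compile(r"Roundtrip Delay:\s*\d+\s*\(([+-]?\d+\.\d+)s\)")


def roundtrip_delays(w32tm_output: str) -> List[float]:
    """Return every round-trip delay found in the text, in seconds."""
    return [float(m.group(1)) for m in ROUNDTRIP_RE.finditer(w32tm_output)]


def average_delay_ms(w32tm_output: str) -> Optional[float]:
    """Average the round-trip delays in milliseconds; None if no samples."""
    delays = roundtrip_delays(w32tm_output)
    return mean(delays) * 1000.0 if delays else None


if __name__ == "__main__":
    captured = (
        "[non-NTP Packet]\n"
        "Destination Timestamp: Roundtrip Delay: 14874900 (+00.0148749s)\n"
        "Local Clock Offset: 314988900 (+00.3149889s)\n"
    )
    print(average_delay_ms(captured))
```

Comparing the number of matched samples with the number requested via /samples would give a rough loss figure as well, though that step is left out here.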

Needless to say, the fault (which they did eventually fix) was manifesting itself in UDP as well.

27 Apr 2020 15:55 #11 by gracecourt

admin wrote:
I think you misunderstood my point about a specific model and specific environments so it isn't "all fine and dandy" - it's directly relevant because your problem is not the normal experience which means there's likely something environmental causing the misbehaviour - we just don't know what yet. Also, it's unclear what testing period you're applying to each variation of hardware. e.g. when the problem occurs, is it continuous and for what period...does it stop if the unit is power cycled etc.



I've nothing to hide; in fact, I'm interested in how you explain the following as "...likely something environmental".
First 2862ac - After several months of operation in environmental conditions within the specified limits for this equipment (I'm trying to save you any more requests for more information) - became completely inoperable, power cycling constantly such that it couldn't even be re-flashed or re-configured... returned to retailer who didn't RMA it or respond to messages for over a month: a second 2862ac was purchased when the retailer refused a Section 9 refund and at a Part 27 hearing before a District Judge, I was awarded judgement.

Second 2862ac - After just under one year of problem-free normal operation as above, failed as set out in the third post above. RMA'd direct to Bonus Ltd and it was returned in full working order... but it subsequently transpired that they "repaired" it by replacing the router PCB so only the original wi-fi card was returned in the "new" unit - it's not known if the replacement PCB was "new", but to be frank it didn't concern me as it was fully functional... no WAN packet loss, no LAN packet loss, no abnormal latency.

"Third" 2862ac - Less than two months later, drop-outs on speech in VOIP calls alerted me to check the FireBrick monitoring. Sure enough, just after 14h00 on Tuesday 31 March the FireBrick graphs showed the return of WAN packet loss and latency > 1s as described in the fifth post above. An RMA request was made, attaching the WAN monitoring and showing that after the issue was constantly present for over a week, the WAN packet loss and latency issue disappeared for the ~2 hours that another router was in situ, and immediately returned when the 2862ac was put back.

Continued in next post > > >

27 Apr 2020 16:01 #12 by gracecourt
But two weeks after the RMA request, and after providing a copy of the config to Bonus Ltd because they seemed to think that configuration corruption was causing this, an RMA still hadn't been approved. This is not why I paid the £270+ for the router, on top of the cost of the RMA on each occasion. I believe that I have been more than patient in trying to resolve these issues and, as already stated, a District Judge awarded judgement to me on a Section 9 claim on the first router failure, so I'm fairly confident of repeating that success on the third!

admin wrote:
If you've given up and don't want to listen to technical suggestions then you're wasting your/our time here. You're in the best position to diagnose the issue and eliminate factors. It is extremely unlikely that hardware just 'goes faulty' like that.



So, two completely different types of failure and the first and the third failures occurred after a number of months of perfectly normal operation. Unlikely, yes. But in no way impossible, and quite likely if there is a design problem or a build quality issue. I won't hazard an estimate of the probability, but it would need some data relating to the RMA or failure rate of this specific model, as I still have a 15-year-old Vigor 2800 providing an "always on" WDS wireless bridge that has never been a problem, and another similar 2800 to swap in as a replacement if it ever does! Yet modern routers costing over £270 are failing in different ways, all well within warranty...

admin wrote:
Also, your issue is different to that reported by the OP as he referred to pinging the LAN interface.



In my first post, I included a PingPlotter trace to the LAN interface that did indeed clearly display abnormal packet loss on that interface as well. Perhaps you can clarify how that differed from what the OP reported. It might well be different; I'm just interested to know why you believe it to be different.

Really... I am!
