You're absolutely correct. The second study I referenced, which Nature published simultaneously, was interesting in large part due to the considerable variation they found in the Y chromosome. I'm not sure anyone predicted that normal and healthy Y chromosomes could differ by millions of base pairs.
We'd thought for quite some time that we had a good handle on the total number of base pairs in the Y. The GRCh38 reference genome--in place for a decade now and used by FTDNA for yDNA work--has it pegged at 57,227,415 base pairs. The older version of the reference model that we still use for our common autosomal tests, GRCh37, turns out to have been closer with a count of 59,373,566 base pairs. The research by the Telomere-to-Telomere Consortium (T2T) found 62,460,029 base pairs.
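To put those three counts side by side, here is a quick bit of arithmetic in Python. The numbers are the ones quoted above; the variable names are just mine for illustration.

```python
# Y chromosome lengths in base pairs, as quoted in this post.
y_lengths = {
    "GRCh38": 57_227_415,
    "GRCh37": 59_373_566,
    "T2T": 62_460_029,
}

t2t = y_lengths["T2T"]

# How many base pairs each older reference falls short of the T2T count.
shortfall = {name: t2t - n for name, n in y_lengths.items() if name != "T2T"}

for name, missing in shortfall.items():
    print(f"{name}: {missing:,} bp short of T2T ({missing / t2t:.1%} of the T2T length)")
```

So GRCh38 comes up a bit over 5.2 million base pairs short of the T2T figure, and GRCh37 a bit over 3 million short.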
But the truth is that we'd never been able to accurately sequence the whole chromosome before. The available technology didn't allow it, and we were sorta guessing by using combinations of other tools.
The FTDNA white paper from March 2019 about the Big Y-700 test gives you a good idea of the large area of the Y chromosome we couldn't sequence previously:
The problem with that region labeled "inaccessible" is that it's densely heterochromatic, or tightly packed and condensed, and highly repetitive. Our current methods of DNA sequencing require that the DNA be broken up into small chunks, and then the sequencing operation does multiple reads of those various bits, typically 30, 60 or 100 times. The data are then aligned sort of like a jigsaw puzzle so that the laboratory can tell which pieces go where.
Think of it as having a really long, Faulknerian sentence that someone has cut up into fragments, each only a few letters long. You have umpteen fragments because each individual letter has been read 30 times, but it appears in different places in each of the fragments. For example, in the phrase "umpteen fragments" you may have one sliver of paper that has "agm"; another that has "n f"; and another that has "een."
Your job is then to reassemble all those tiny pieces of paper to figure out exactly what Faulkner wrote. It's like a nightmare version of Wheel of Fortune, only worse, because there are only four letters and, collectively, you'd have anywhere from about 498 million letters (248.96 million base pairs) on the longest chromosome, Chromosome 1, down to about 125 million letters (62.46 million base pairs) on one of the smallest chromosomes, the Y.
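If you want to see the paper-shredding analogy in action, here is a toy sketch in Python. It is only an illustration of the analogy, not how a sequencer actually works: the function name `shred`, the three-letter fragment size, and the fixed random seed are all my own assumptions.

```python
import random

def shred(text, fragment_len=3, coverage=30, seed=0):
    """Cut `text` into random overlapping fragments.

    Enough fragments are drawn that each letter is read about
    `coverage` times on average (a toy stand-in for 30x read depth).
    """
    rng = random.Random(seed)
    n_fragments = coverage * len(text) // fragment_len
    last_start = len(text) - fragment_len
    starts = [rng.randrange(last_start + 1) for _ in range(n_fragments)]
    return [text[s:s + fragment_len] for s in starts]

sentence = "umpteen fragments"
fragments = shred(sentence)

print(len(fragments), "slivers of paper, for example:", fragments[:5])
```

Every sliver is a genuine piece of the sentence, but nothing on the sliver tells you where it came from. That positional information is what the alignment step has to reconstruct.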
True, in "shotgun" and short-read DNA sequencing you'd get hundreds of letters per individual segment, but with only those four letters you can imagine the conundrum if you had regions where long and multiple series of repetitions were present. What do you do if your read length is 500 letters and you have a chromosomal region that contains 900-letter sequences that repeat scores of times in a row?
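Here is a small Python sketch of that exact conundrum, using made-up data. I'm assuming a random 900-letter unit repeated 20 times in a row with random flanking sequence on either side, and I'm using exact string matching as a stand-in for alignment (real aligners tolerate mismatches, so treat this as a toy). The helper names `find_all` and `random_dna` are hypothetical.

```python
import random

def find_all(genome, read):
    """Return every start position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions

rng = random.Random(1)

def random_dna(n):
    """Build a random string of n letters drawn from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(n))

unit = random_dna(900)        # the 900-letter repeating sequence
left = random_dna(2000)       # unique sequence before the repeats
right = random_dna(2000)      # unique sequence after the repeats
copies = 20                   # assumed repeat count ("scores of times")
genome = left + unit * copies + right

# A 500-letter read taken from inside the sixth copy of the repeat.
read_start = len(left) + 5 * len(unit) + 100
short_read = genome[read_start:read_start + 500]

# A long read that spans the whole repeat array plus 100 letters of each flank.
long_start = len(left) - 100
long_read = genome[long_start:len(left) + copies * len(unit) + 100]

print("short read fits at", len(find_all(genome, short_read)), "places")
print("long read fits at", len(find_all(genome, long_read)), "place(s)")
```

The short read fits equally well in every copy of the repeat, so there is no way to say which copy it came from or even how many copies there are. The long read reaches unique sequence on both sides, so it can only go in one place.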
In a nutshell, that's the problem the newest generation of hybrid long-read and nanopore sequencing solves.
Highly repetitive areas like that are found in chromosomes 1, 3, 9, 16, and 19, plus the entire short arms of the acrocentric chromosomes: 13, 14, 15, 21, and 22. If you use DNA Painter, for instance, you'll see some of these regions grayed out in the map. But proportionate to size, no chromosome contains more of this mysterious area than the Y chromosome (I'm excluding the inactive X in women, called the Barr body; that's a whole 'nother topic).
What the new sequencing of the Y chromosome gives us, for the first time, is a clear picture of what the entire chromosome looks like. For genealogy or investigative (forensic) genealogy, we just don't know if there's going to be much in the previously mysterious region that will be of use. Time will tell...but don't expect to see this type of sequencing technology available to us at the big testing/matching companies for a while; it isn't even clear whether it will be commercially feasible in the near term.
But to your point about variability, that's exactly why there's been a push to shift to a pangenomic model, and why the Genome Reference Consortium, which includes the NCBI, has indefinitely postponed the release of GRCh39. Over the entire genome, this "mystery area" accounts for around 8% to 9% of everything. The detail we're getting from the newest generation of sequencing technologies is showing us that humans have more genetic differences than previously thought.
Thus the move to go to a pangenome reference and stop using a single model for the global population...a model where the majority of the genetic data was obtained from one man who lived in Buffalo, New York; a man who happened to respond to a newspaper ad about DNA testing for research.