BBO Discussion Forums: Zar points, useful or waste of energy - BBO Discussion Forums


Zar points, useful or waste of energy New to the concept, does it help...

#241 User is offline   Zar 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 153
  • Joined: 2004-April-03

Posted 2005-September-07, 08:25

>
Do you bother to pay any attention to other people's work?
<

Nope ... why waste time reading when I can use it for writing? Plus, reading is a degrading activity – as if they are smarter than me ... You have to manage your time wisely and project some dignity ... (end-of-quote :-)

>
BUMRAP + 531 is based on
A = 4.5
K = 3
Q = 1.5
J = .75
T = .25
plus adding points for shortage...
<

Aha … 1.50, 0.75, 0.25 ... may be 0.07 for the 9? Makes sense ... fits the downhill slope ...


I’ll publish the detailed STD amounts point-by-point and all the rest of the stats actually, but here are the overall results for the Standard Deviation. Since all functions are Bell-shaped and peak at the Game level (an interesting finding by itself), we can present the peak only:

ZPR 0.93
ZPB 0.94
GP 0.96
BP 0.96
ZP3 0.98
LP 1.05
WTC 1.09
LTC 1.22
LTM 1.23


Losing Trick Count is by far the least accurate method, be it Classic or Modern (measured by the STD rather than IMP).

Again, I am preparing a detailed presentation and analysis and I’ll let you know when it is on the webpage.

ZAR

#242 User is offline   inquiry 

  • PipPipPipPipPipPipPipPipPipPip
  • Group: Admin
  • Posts: 14,566
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Amelia Island, FL
  • Interests:Bridge, what else?

Posted 2005-September-07, 08:37

hrothgar, on Sep 7 2005, 09:53 AM, said:

inquiry, on Sep 7 2005, 04:27 PM, said:

Well Richard, I don't want to get too technical about this, but as I have pointed out several times before, BUMRAP + 531 and ZAR points for "honors" are essentially the same. What do you say? Look for yourself...

I don't want to say tysen has re-invented the Zar wheel, but the similarities here are very close. The initial evaluation is almost IDENTICAL; the place where the difference comes into play is when a fit is found.

Get a bloody clue Ben. You've drunk WAY too much of the Kool Aid...

First of all, let's consider the whole honor point count "issue": I'm well aware that both Zar Points and BUMRAP evaluate the relative strength of Aces/Kings/Queens/Jacks using a near identical ratio. Indeed, I've made posts in the past which stated directly that I suspected that the accuracy of Zar's hand evaluation scheme was largely a function of this ratio. [Personally, I don't credit this "innovation" to either Zar or Tysen. The earliest reference that I've been able to track down is contained in "The Four Aces System of Contract Bridge" which dates back to the 1930s...]

If we remove the honor point count from Zar and BUMRAP, we're left with the question of how one should account for distribution. Zar advocates (a+b) + (a-d). BUMRAP +5/3/1 counts 5 points for a void, 3 for a singleton, and 1 for a doubleton. Guess what? Zar's system of accounting for distribution isn't as accurate as the 5/3/1 scale...

As for the claim that Tysen is "re-inventing" Zar's work: I recall when Zar originally started posting his work on the web. Remember those days long ago when Zar was talking about "aggressive" hand evaluation and Tysen and I were trying to explain the concept of "accurate" hand evaluation? To the extent that anything has drifted back and forth, it's been the fact that Zar has slowly started to use more reasonable metrics to evaluate his own work.

Nothing wrong with Kool-Aid, per se, as long as it is Black Cherry...

I specifically said, "I don't want to say" that tysen re-invented the Zar wheel... and I know the hcp + control count didn't start with ZAR (as he says so himself in his documents). I just wanted to point out two issues... first, BUMRAP honor points are identical (or essentially identical) to ZAR honor points. And there is VERY little difference between 5+3+1 distributional points and ZAR distributional points.

But ZAR has stuck with his distributional points, while Tysen has evolved his. Now, again, if you subtract 8 distributional points from ZAR's total, and compare 5+3+1 to ZAR points, you will find the difference is very small, essentially 0 or 1 for virtually all normal distributions, and remember "1" is the full value for a random JACK... A few examples...

4333 = Zar = 0 (8-8), 5+3+1 = 0
4432 = ZAR = 2 (10-8), 5+3+1 = 1
4441 = ZAR = 3 (11-8), 5+3+1 = 3
5332 = ZAR = 3 (11-8), 5+3+1 = 2
5431 = ZAR = 5 (13-8), 5+3+1 = 4
5521 = ZAR = 6 (14-8), 5+3+1 = 5
6322 = ZAR = 5 (13-8), 5+3+1 = 4
6331 = ZAR = 6 (14-8), 5+3+1 = 5
6430 = ZAR = 8 (16-8), 5+3+1 = 7
7330 = ZAR = 9 (17-8), 5+3+1 = 8
7420 = ZAR = 10 (18-8), 5+3+1 = 9

As you can see, ZAR is "slightly" (by one Zar point) more aggressive than 5+3+1 on most hands. This is actually a function of the fact (as ZAR has pointed out) that 4333 is overevaluated at a count of 8. If you use 9 as the starting point instead (so that 4333 is worth -1 DP rather than 0 when corrected), you get a more accurate marker when doing these system comparisons, and the methods of initial evaluation become essentially IDENTICAL. (That is, 4333 is not worth "8" or zero DP, it is worth -1... note to hannie: see why I pass some surprising 4333 hands?)
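The ZAR column in the table above can be reproduced from the (a+b) + (a-d) rule, where a and b are the two longest suits and d is the shortest. The following is only a minimal sketch: the function names are made up for illustration, and the `base=8` default is just the "subtract 8" correction described above (pass `base=9` for the alternative starting point).

```python
def zar_distribution(shape):
    """Raw Zar distributional points: (a + b) + (a - d).

    `shape` is a string of four suit lengths, e.g. "5431".
    """
    lengths = sorted((int(c) for c in shape), reverse=True)
    a, b, d = lengths[0], lengths[1], lengths[3]
    return (a + b) + (a - d)


def corrected_zar_distribution(shape, base=8):
    """Zar distributional points with the flat 4333 baseline removed."""
    return zar_distribution(shape) - base


shapes = ["4333", "4432", "4441", "5332", "5431", "5521",
          "6322", "6331", "6430", "7330", "7420"]
for shape in shapes:
    print(shape, zar_distribution(shape), corrected_zar_distribution(shape))
```

With `base=9` every corrected value drops by one, so 4333 comes out at -1.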

So where you claim "Guess what? Zar's system of accounting for distribution isn't as accurate as the 5/3/1 scale", at least for the initial evaluation I claim that HAS TO BE entirely BS. The reason being that these methods are darn near identical for initial evaluation.

The place where the differences come into play is in what ZAR terms the aggression and, if I remember correctly, the anti-aggression. That is, in re-evaluation. ZAR will start heaping extra points onto the hands if a FIT exists; it is not clear that this happens with 5+3+1 (there is no great write-up of the method, compared to ZAR where he goes into great detail). Now, come on Richard, admit that when using BUMRAP 5+3+1, when you have a fit, you up-evaluate something... and when you have a misfit you down-evaluate. We all do this "automatically" with or without a point scale telling us what to do. It depends upon our experience. I down-evaluate hands with 4333, for instance, and those with misfits.

So what ZAR has done is to try to figure out how much to down-evaluate misfits (misfit points) and to up-regulate fits (fit points) and superfits (add misfit points rather than subtract them). That is, to place a quantitative value on certain features. Now this is where ZAR and BUMRAP 531 as identified by Tysen begin to part company. Tysen adds a point for suits with two honors in them. ZAR adds one point for each honor in PARTNER's suit (up to two), and one point for "concentrated" honors in two suits. BUMRAP is slightly more aggressive here on the majority of hands, but the difference is very small. But the real parting of company is the FIT (or lack of it) calculations that seem to be missing from the other method. Zar is hyper-aggressive and hyper-conservative based upon fit or lack of it.

I don't know about you, but I have EXAMINED actual hands with my eyes (not software mathematical calculations)... I realize the sample size could be too small, but the plus evaluation and minus evaluation used by ZAR seem to accurately reflect what one would do by "feel" and seem to greatly improve the accuracy when fits and misfits occur.

Since BUMRAP 5+3+1 is (as I noted above) essentially identical to initial ZAR evaluation (especially if you consider 9 DP as the "base" Zar DP schedule rather than 8, so as to properly devalue 4333), there can be little doubt in my mind that if BUMRAP doesn't include a re-evaluation tool based upon fit or no fit, it cannot be as accurate as ZAR +/- fit points. Now, maybe I have missed the re-evaluation tools with 5+3+1; if there are some, someone needs to explain.
--Ben--

#243 User is offline   hrothgar 

  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 15,488
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Natick, MA
  • Interests:Travel
    Cooking
    Brewing
    Hiking

Posted 2005-September-07, 08:43

Zar, on Sep 7 2005, 05:25 PM, said:

I’ll publish the detailed STD amounts point-by-point and all the rest of the stats actually, but here are the overall results for the Standard Deviation. Since all functions are Bell-shaped and peak at the Game level (an interesting finding by itself), we can present the peak only:

ZPR 0.93
ZPB 0.94
GP 0.96
BP 0.96
ZP3 0.98
LP 1.05
WTC 1.09
LTC 1.22
LTM 1.23

Comment 1: Could you provide definitions to accompany the acronyms... For example, I THINK that GP is Goren points and that BP are Binkie Points but some confirmation would be nice.

Comment 2: I don't see any calculations for BUMRAP + 5/3/1

Comment 3: If GP is "traditional" Goren, it seems strange that it's scoring so well...
Alderaan delenda est

#244 User is offline   han 

  • Under bidder
  • PipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 11,797
  • Joined: 2004-July-25
  • Gender:Male
  • Location:Amsterdam, the Netherlands

Posted 2005-September-07, 10:09


BUMRAP honor points are identical (or essentially identical) to ZAR honor points.


Of course, exactly identical, except for two things:

1) Zar honor points ignores 10's.

2) Zar points has to be normalized to compare with traditional HCP's, BUMRAP doesn't.

Of course, Zar is very much aware of this, and has done these to avoid fractions. I suspect that multiplying BUMRAP by 4 also avoids fractions.
Please note: I am interested in boring, bog standard, 2/1.

- hrothgar

#245 User is offline   han 

  • Under bidder
  • PipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 11,797
  • Joined: 2004-July-25
  • Gender:Male
  • Location:Amsterdam, the Netherlands

Posted 2005-September-07, 10:15

I’ll publish the detailed STD amounts point-by-point and all the rest of the stats actually, but here are the overall results for the Standard Deviation. Since all functions are Bell-shaped and peak at the Game level (an interesting finding by itself), we can present the peak only:

ZPR 0.93
ZPB 0.94
GP 0.96
BP 0.96
ZP3 0.98
LP 1.05
WTC 1.09
LTC 1.22
LTM 1.23


Losing Trick Count is by far the least accurate method, be it Classic or Modern (measured by the STD rather than IMP).

Again, I am preparing a detailed presentation and analysis and I’ll let you know when it is on the webpage.

ZAR



Thanks, this is interesting indeed. I do hope that you will include more modern point counts besides Zar (try 18-12-6-3-1 for BUMRAP plus 20-13-4 for shortness if you want to avoid fractions).

#246 Guest_Jlall_*

  • Group: Guests

Posted 2005-September-07, 10:31

Hannie, on Sep 7 2005, 11:15 AM, said:

Thanks, this is interesting indeed. I do hope that you will include more modern point counts besides Zar (try 18-12-6-3-1 for BUMRAP plus 20-13-4 for shortness if you want to avoid fractions).

yikes... I can't add that high :)

#247 User is offline   inquiry 

  • PipPipPipPipPipPipPipPipPipPip
  • Group: Admin
  • Posts: 14,566
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Amelia Island, FL
  • Interests:Bridge, what else?

Posted 2005-September-07, 10:45

Hannie, on Sep 7 2005, 12:09 PM, said:

Of course, exactly identical, except for two things:

1) Zar honor points ignores 10's.

Well, ZAR doesn't ignore all tens now, does it? If you have a ten in partner's suit, you can add 1 Zar point (on the correct scale of an ACE being worth 1 point, a TEN in partner's suit is worth 0.16666, whereas each TEN in BUMRAP is worth 0.056).

Of course in ZAR, if you have QJT, KJT, AJT, KQT, AKT in partner's suit, the ten becomes valueless again.

Now it is an academic exercise to determine if a random 10 is worth 1/3 the value of a ten in partner's suit. ZAR adds weight to the fitting TENs; BUMRAP adds less value to fitting tens, but adds value for all of them. In the scheme of things, the value added for a ten (0.25) is not enough to sway a decision, I think, so you probably do as I do, and say that it is "a" plus value, but what minuses do I have? To me, counting 10's in our suit, and ignoring 10's in other suits, makes sense. Probably 1 ZAR point for such a TEN is a tad too much, at least I think so. So I ignore the point from a ten when not vul, and count it when vul.
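The 0.16666, 0.056 and 1/3 figures come from rescaling each method so that an ace is worth 1. A quick check of that arithmetic, assuming a Zar ace of 6 (4 HCP plus 2 controls) and the BUMRAP values quoted earlier in the thread (A = 4.5, T = 0.25); the variable names are only illustrative:

```python
from fractions import Fraction

# Zar: an ace is 4 HCP + 2 control points; a ten in partner's suit adds 1.
zar_ace = Fraction(6)
zar_fitting_ten = Fraction(1)

# BUMRAP: A = 4.5, T = 0.25, and every ten counts.
bumrap_ace = Fraction(9, 2)
bumrap_ten = Fraction(1, 4)

zar_ten_in_aces = zar_fitting_ten / zar_ace      # 1/6
bumrap_ten_in_aces = bumrap_ten / bumrap_ace     # 1/18

print(float(zar_ten_in_aces))                    # about 0.1667
print(float(bumrap_ten_in_aces))                 # about 0.0556
print(bumrap_ten_in_aces / zar_ten_in_aces)      # 1/3
```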

Quote

2) Zar points has to be normalized to compare with traditional HCP's, BUMRAP doesn't.

Of course, Zar is very much aware of this, and has done these to avoid fractions. I suspect that multiplying BUMRAP by 4 also avoids fractions.


Not exactly sure why this is an issue. Telling your opponents that you have 19 BUMRAP 531 points is hardly helpful if 5 of those are from controls and 4 from 531 additions. That means your 19 points is really only 10. Saying "19 with distribution" is not right either. Or you could use the more complicated scale (ACE worth 4.5, king 3, etc). Now saying "19 pts with distribution" is closer for them, but still not as useful to them. Will they know your aces are worth 4.5, your queens only 1.5, your jacks 0.75? How are they to count your hands even after you have disclosed?
--Ben--

#248 User is offline   tysen2k 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 406
  • Joined: 2004-March-25

Posted 2005-September-07, 10:45

You guys are all pretty much right. I just want to state that:

1. Zar, BUMRAP, and TSP all yield very similar results. That is because the main benefit from these methods comes from shifting the HCP values from a 4-3-2-1 ratio to a 3-2-1-0.5 ratio. The distribution is fairly minor compared to this.

2. Zar is a perfectly good system and much better than regular HCP. My single and only complaint about Zar has been that it's slightly more complicated to calculate than BUM+531 and it's not any more accurate. Plus it uses a completely different "scale" which some people don't want to use and makes it more difficult to explain to opponents.

3. The only reason I invented TSP was that I said to myself, "what is the most accurate point count method I can make using reasonably sized whole numbers?" And TSP came out of that. It's not that much better than Zar or BUM, but it's the best I could do. If you want a simple method that has the same scale, just use BUMRAP. After all, if you have this hand:

ATxxx
Axx
ATx
xx

It's much easier to explain that you've decided to upgrade this hand to 15 points rather than explain that you've got 29 Zar.
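For anyone who wants to check the 15, here is a minimal sketch of a BUMRAP + 5/3/1 count. It assumes the honor values quoted earlier in the thread (A = 4.5, K = 3, Q = 1.5, J = 0.75, T = 0.25) and 5/3/1 for void/singleton/doubleton; the hand format (one string per suit, "x" for small cards) and the function name are just conventions chosen for the example.

```python
BUMRAP = {"A": 4.5, "K": 3.0, "Q": 1.5, "J": 0.75, "T": 0.25}
SHORTNESS = {0: 5, 1: 3, 2: 1}  # void, singleton, doubleton


def bumrap_531(hand):
    """BUMRAP honor points plus 5/3/1 shortness points.

    `hand` is a list of four strings, one per suit; use "" for a void.
    """
    honors = sum(BUMRAP.get(card, 0.0) for suit in hand for card in suit)
    shortness = sum(SHORTNESS.get(len(suit), 0) for suit in hand)
    return honors + shortness


hand = ["ATxxx", "Axx", "ATx", "xx"]
print(bumrap_531(hand))  # 15.0
```

Three aces and two tens give 14 honor points, and the doubleton adds 1.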

Tysen
A bit of blatant self-pimping - I've got a new poker book that's getting good reviews.

#249 User is offline   inquiry 

  • PipPipPipPipPipPipPipPipPipPip
  • Group: Admin
  • Posts: 14,566
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Amelia Island, FL
  • Interests:Bridge, what else?

Posted 2005-September-07, 11:03

tysen2k, on Sep 7 2005, 12:45 PM, said:

After all, if you have this hand:

ATxxx
Axx
ATx
xx

It's much easier to explain that you've decided to upgrade this hand to 15 points rather than explain that you've got 29 Zar.

I don't think you are REQUIRED to give your exact count (whether it is ZAR, Goren, Milton Work, or TSP). I think a general range is what is required. This hand would be described (no matter how you play it) as better than a minimum opening, or approximately a king better than a minimum opening, but with the note that WE OPEN very light (as few as 8 hcp). In fact, I alert EVERY ROUND that I open very light.

Now, telling your opponents that you have 15 hcp when you hold 12 is likely to cause at the very least hard feelings.. "I would have bid if I had known he was so light" attitude. This is why I tell them I open very light (as few as 8 hcp) and then the range of the bid (better than minimum, not forcing, etc).

You might also try Marty's answer, Points? Smoints!
--Ben--

#250 User is offline   Zar 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 153
  • Joined: 2004-April-03

Posted 2005-September-07, 11:22

>
Comment 1: Could you provide definitions to accompany the acronyms... For example, I THINK that GP is Goren points and that BP are Binkie Points but some confirmation would be nice.
<

Sorry, forgot you don’t want to read the book – all the acronyms are from there (since I just cut and paste from the generated tables). GP is Goren Points, LP is Lawrence Points, BP is Bergen Points, ZPR is Zar Points with Ruffing, ZPB is Basic Zar Points, etc.

>
Comment 2: I don't see any calculations for BUMRAP + 5/3/1
<

Really? :-) Shall we talk about this again? You have to try reading, man :-)

>
If we remove the honor point count from Zar and BUMRAP, we're left with the question of how one should account for distribution. Guess what? Zar's system of accounting for distribution isn't as accurate as the 5/3/1 scale...
<

5-3-1 rocks, man.

It’s a good idea to check it out first though – just for yourself so you don’t get embarrassed in public. You can go to the website and check automatically the 7-4-1, 5-3-1, and 3-2-1. Yeah ... it will take some reading, sorry.

>
Zar has slowly started to use more reasonable metrics to evaluate his own work.
<

Slowly? As slow as my 3GZ computer is :-) I presented the IMP-based comparison, now the STD-based comparison. Anything else?

>
Comment 3: If GP is "traditional" Goren, it seems strange that it's scoring so well...
<

To be honest, I was surprised myself. In fact, I am surprised by BOTH Goren and Bergen, since BOTH also peak for 10 tricks EXACTLY where they say (26 Goren and 40 Bergen). Aggressive methods like Lawrence and Zar peak at 11 while very-conservative ones like WTC and LTC peak at 9.

You’ll see all that in the document when I post it.

ZAR

#251 User is offline   mike777 

  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 16,826
  • Joined: 2003-October-07
  • Gender:Male

Posted 2005-September-07, 14:45

I mentioned earlier that I tried reading Mr. Zar but could not understand page one let alone the rest.

Am I alone on this site in not understanding this debate?

What is the hypothesis that is being tested, assuming there is one? If so, could someone post it not only in math terms but also in plain bridge English terms? Both would help me.
Thank you in advance.

#252 User is online   awm 

  • PipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 8,375
  • Joined: 2005-February-09
  • Gender:Male
  • Location:Zurich, Switzerland

Posted 2005-September-07, 15:04

Here's my understanding of the debate.

We would like to design a good hand evaluation measure.

The idea of such a measure is that you can look at your hand and compute some number. Your partner looks at his hand and computes some number. By combining these two numbers together, without any other information about the hands, we can decide whether we have game, whether we have slam, and so forth with a reasonable degree of accuracy.

Why do we want a hand evaluation measure?

The goal in bridge is to find your best contract. The problem is, you don't have enough bidding space to exactly describe every card to partner. Since I can't just lay my hand on the table and let partner pick a contract, I need to select a (relatively small) amount of data to communicate such that partner can do a reasonably good job.

Isn't this business about computing numbers an oversimplification?

Yes, obviously so. But people have historically used it as a starting point at the table. The first attempt at a hand evaluation system (that I know of) is the milton work point count (4-3-2-1, familiar to most of us) with 26 being enough for game. Once we agree on a basic system, we can start worrying about how to adjust the evaluation when some additional information about length of suits has been communicated as well.

So what's the debate about?

Zar has designed a method of hand evaluation which he believes to be good. This method counts as follows:

Start with your standard 4-3-2-1 point count. Add two points for each ace and one for each king (controls). Now add the sum of the lengths of your two longest suits. Now add the difference of your longest suit length and shortest suit length. This gives you the "number" described above.

Zar claims that you should bid game if your number plus your partner's is 52. Thresholds are also given for slam. Of course, like any system that doesn't take into account relative shapes, this is of limited accuracy. Zar has proposed fit/misfit points to adjust for that once the shapes are known.
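The counting recipe above can be sketched in a few lines. This covers only the basic count as described in this post (no fit/misfit adjustments), with a hand format and function names invented for the example. As a sanity check it uses the hand Tysen posted earlier in the thread, which he gives as 29 Zar.

```python
HCP = {"A": 4, "K": 3, "Q": 2, "J": 1}
CONTROLS = {"A": 2, "K": 1}


def zar_points(hand):
    """Basic Zar count: 4-3-2-1 HCP + controls + (a + b) + (a - d).

    `hand` is a list of four strings, one per suit; use "" for a void.
    """
    cards = [card for suit in hand for card in suit]
    hcp = sum(HCP.get(card, 0) for card in cards)
    controls = sum(CONTROLS.get(card, 0) for card in cards)
    lengths = sorted((len(suit) for suit in hand), reverse=True)
    distribution = (lengths[0] + lengths[1]) + (lengths[0] - lengths[3])
    return hcp + controls + distribution


def enough_for_game(mine, partners, threshold=52):
    """Combined count reaches the game threshold quoted above."""
    return mine + partners >= threshold


hand = ["ATxxx", "Axx", "ATx", "xx"]
print(zar_points(hand))  # 29 = 12 HCP + 6 controls + 11 distribution
```

Under the 52 threshold, this hand would need partner to show up with 23 or more.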

The following questions are being addressed:

(1) Is Zar's evaluation scheme better than others in the literature? The other schemes include the 4-3-2-1 count, losing trick count, and various others. Zar has run through a very large library of hands, computing for each one the Zar count and comparing its "predicted number of tricks" to the actual number. His data suggests that his method outperforms the competing methods. Tysen mentions that Zar doesn't include the BUM point method (basically a 3-2-1-0.5 scale). He claims that the main reason Zar outperforms other methods is the reweighting of the honors (i.e. aces are underweighted in the 4-3-2-1 scheme and quacks overweighted) and that Zar's method of counting distribution by adding/subtracting suit lengths is actually less accurate than adding points for singletons and voids.

(2) Do the additional fit/misfit points Zar adds when distribution is known accurately reflect these sorts of features? Here less seems to be known; Tysen points out that Zar's scheme of adding fit/misfit points seems to weigh things differently depending on who opens (with two identical hands), which seems kind of odd.

(3) There is also some debate about what is simple to compute at the table, and what is reasonably simple to explain to opponents.
Adam W. Meyerson
a.k.a. Appeal Without Merit

#253 User is offline   tysen2k 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 406
  • Joined: 2004-March-25

Posted 2005-September-07, 15:29

Zar, on Sep 6 2005, 08:05 PM, said:

Superfit points are calculated straightforward – 0123 for the Zar Ruffing method and straight 3 for the ZP3. Obviously regardless of “opener” – since there is simply no opener.

Okay, then is this 2 or 4 superfit points if there is no opener?

xxxxx
x
xxx
xxxx

xxxxx
xx
xxx
xxx

#254 User is offline   mike777 

  • PipPipPipPipPipPipPipPipPipPipPip
  • Group: Advanced Members
  • Posts: 16,826
  • Joined: 2003-October-07
  • Gender:Male

Posted 2005-September-07, 15:38

awm, on Sep 7 2005, 04:04 PM, said:

Here's my understanding of the debate.

We would like to design a good hand evaluation measure.

The idea of such a measure is that you can look at your hand and compute some number. Your partner looks at his hand and computes some number. By combining these two numbers together, without any other information about the hands, we can decide whether we have game, whether we have slam, and so forth with a reasonable degree of accuracy.

Why do we want a hand evaluation measure?

The goal in bridge is to find your best contract. The problem is, you don't have enough bidding space to exactly describe every card to partner. Since I can't just lay my hand on the table and let partner pick a contract, I need to select a (relatively small) amount of data to communicate such that partner can do a reasonably good job.

Isn't this business about computing numbers an oversimplification?

Yes, obviously so. But people have historically used it as a starting point at the table. The first attempt at a hand evaluation system (that I know of) is the milton work point count (4-3-2-1, familiar to most of us) with 26 being enough for game. Once we agree on a basic system, we can start worrying about how to adjust the evaluation when some additional information about length of suits has been communicated as well.

So what's the debate about?

Zar has designed a method of hand evaluation which he believes to be good. This method counts as follows:

Start with your standard 4-3-2-1 point count. Add two points for each ace and one for each king (controls). Now add the sum of the lengths of your two longest suits. Now add the difference of your longest suit length and shortest suit length. This gives you the "number" described above.

Zar claims that you should bid game if your number plus your partner's is 52. Thresholds are also given for slam. Of course, like any system that doesn't take into account relative shapes, this is of limited accuracy. Zar has proposed fit/misfit points to adjust for that once the shapes are known.

The following questions are being addressed:

(1) Is Zar's evaluation scheme better than others in the literature? The other schemes include the 4-3-2-1 count, losing trick count, and various others. Zar has run through a very large library of hands, computing for each one the Zar count and comparing its "predicted number of tricks" to the actual number. His data suggests that his method outperforms the competing methods. Tysen mentions that Zar doesn't include the BUM point method (basically a 3-2-1-0.5 scale). He claims that the main reason Zar outperforms other methods is the reweighting of the honors (i.e. aces are underweighted in the 4-3-2-1 scheme and quacks overweighted) and that Zar's method of counting distribution by adding/subtracting suit lengths is actually less accurate than adding points for singletons and voids.

(2) Do the additional fit/misfit points Zar adds when distribution is known accurately reflect these sorts of features? Here less seems to be known; Tysen points out that Zar's scheme of adding fit/misfit points seem to weigh things differently depending on who opens (with two identical hands) which seems kind of odd.

(3) There is also some debate about what is simple to compute at the table, and what is reasonably simple to explain to opponents.

Wow another very clear, well written post. Thank you very much.

At the risk of getting swatted down let me try and start at the very beginning.
"the goal in bridge is to find your best contract" For sake of discussion "Hand evaluation" is a very important tool in achieving this goal. So Zar and others are devising "Hand evaluations" to achieve this goal.

Ok here I step out on a limb. I do not think "THE goal of bridge is to find your best contract" and therefore this debate is centered on an incorrect goal.

I would argue the goal of bridge is to WIN. What are the smaller goals we need to achieve to win is a good debate. One goal may be to find your best contract another may be to make life difficult for the opp. Perhaps hand evaluation should have other goals besides reaching the best contract, perhaps not.

In any event I do not see how you find the best "hand evaluation" method without first debating what the goal of hand evaluation should be. Maybe it should be finding the best contract; maybe it should be something else.

#255 User is offline   Zar 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 153
  • Joined: 2004-April-03

Posted 2005-September-07, 15:56

>
I mentioned earlier that I tried reading Mr. Zar but could not understand page one let alone the rest.
<

Page one is the title, Mike – not much to understand there :-) I have lots of requests to provide a “short” and “no-data-please” presentation of the material. Even requests like “Zar, I unconditionally trust you about any information you present – no NEED to give me proofs and traces of changes of the data so I can even follow the tendency. Just strip the damn thing from any tables and numbers that make me dizzy and give me the conclusions for me to use”.

Mike, I believe you have the same type of request AND I am planning to do a “stripped-down-version” for people that do not need proof and trace of tendencies. That is a perfectly fine request and I will honor it.

>
Am I alone on this site regarding not understanding this debate? What is the hypothesis that is being tested, assuming there is one? If so could some one post it not only in Math terms but also in plain bridge english terms? Both would help me. Thank you in advance.
<

The general “Debate” is actually along the lines of a “pursuit for perfection” in the good sense of the word and as I have said several times “nobody’s perfect” so ... it boils down (apart from the pure Zar-Points-approach discussion) to trying to measure the methods used by “normal people” at the table and see which one makes more sense (and the meaning of that by itself is a partial matter of the debate).

It started with my attempts to test Zar Points against popular methods and see if it is even worth presenting it here and there. It measured “aggressiveness” only and I was rightly accused that this is just “one side of the coin”. Then I decided to make a “complete coverage” of the spectrum and test all the 3 important boundaries (Game, Slam, and Grand) each from BOTH sides of the fence – overbidding and underbidding. It was a “match” of all the 105,000 boards that have between 9 and 13 tricks in Spades (out of 1,000,000 boards) in the NS direction to see what happens. In my view that was the “ultimate test” that everyone would understand – measured in IMPs, everything on the table, nowhere to hide. Then the debate was pushed into the area you don’t like with all the Variances, Standard deviations, Means etc. (if you think you cannot understand the book, wait until I publish all the data and analysis from THIS exercise :-)

Is all this good? Absolutely. I think we all learn a lot in the process and the thread itself is very active which to me means that the stuff discussed is interesting.

ZAR

#256 User is offline   ochinko 

  • PipPipPipPipPip
  • Group: Full Members
  • Posts: 647
  • Joined: 2004-May-27
  • Gender:Male
  • Interests:Cooking

Posted 2005-September-07, 16:18

I am thankful for the works of Zar and Tysen, and happy that they are still researching and present here in the discussion.

I would like to add two more things. The first one should be fairly obvious. BUMRAP and ZAR evaluate all the honors (except for the Ten) in the same way. 4.5-3-1.5-0.75 has exactly the same ratio between the honors as 6-4-2-1. If you add 0.25 for the Ten the first one has a sum of 10 just like Milton Work's. The second one is easier to remember because you have to add the HCP to the values of high card controls as counted in the Blue Club (A=2, K=1).
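Both claims are quick to verify with exact fractions. The sketch below just checks the arithmetic in this post: that 4.5-3-1.5-0.75 is a constant multiple of 6-4-2-1, that adding 0.25 for the ten brings the BUMRAP total to 10 like Milton Work's 4-3-2-1, and that 6-4-2-1 is HCP plus Blue Club controls. The variable names are only illustrative.

```python
from fractions import Fraction

bumrap = [Fraction(9, 2), Fraction(3), Fraction(3, 2), Fraction(3, 4)]  # A K Q J
work = [4, 3, 2, 1]       # Milton Work HCP
controls = [2, 1, 0, 0]   # Blue Club controls: A = 2, K = 1

# Zar honor values are HCP plus controls.
zar = [h + c for h, c in zip(work, controls)]
print(zar)                          # [6, 4, 2, 1]

# Same ratio between honors: every BUMRAP value is 3/4 of the Zar value.
ratios = {b / z for b, z in zip(bumrap, zar)}
print(ratios)                       # {Fraction(3, 4)}

# With 0.25 for the ten, BUMRAP sums to 10 per suit, as does 4-3-2-1.
bumrap_total = sum(bumrap) + Fraction(1, 4)
print(bumrap_total, sum(work))      # 10 10
```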

The other thing (which would be quite a relief for people that don't want to be bothered with new evaluation schemes) is that another researcher (Thomas Andrews - http://bridge.thomas...com/valuations/) found out that on NT contracts BUMRAP doesn't perform any better than Work's point count. He proposes his own evaluation method for NT (A=4, K=2.8, Q=1.8, J=1, T=0.4), which he calls "fifths" because you take one fifth of a point from Kings and Queens, and give two fifths to the Ten.

Petko

#257 User is offline   tysen2k 

  • PipPipPipPip
  • Group: Full Members
  • Posts: 406
  • Joined: 2004-March-25

Posted 2005-September-07, 17:13

To copy the metric being developed in the other thread on DD evaluations (which is starting to generate some interesting discussion), let's see how it applies to shapes as a whole.

Given the shape of our hand, what is the chance that we have a game?

Shape    Game?  531    Zar
4-3-3-3   25%
4-4-3-2   26%
5-3-3-2   27%
5-4-2-2   28%
6-3-2-2   30%
4-4-4-1   32%   <--
5-4-3-1   33%   <--
7-2-2-2   34%   <--    <--
6-3-3-1   34%   <--    <--
6-4-2-1   37%
5-5-2-1   38%          <--
7-3-2-1   38%
5-4-4-0   45%          <--
5-5-3-0   46%
6-4-3-0   47%
6-5-1-1   49%
7-4-1-1   50%
6-5-2-0   54%
7-4-2-0   55%


The first set of arrows shows hands that a 5-3-1 system counts as "equivalent." The second set of arrows shows some hands that Zar counts as equivalent. Which set looks more tightly clustered to you?
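One simple way to put a number on "tightly clustered" is the range of game percentages inside each arrowed group. The sketch below uses only the figures from the table above; the choice of max minus min as the measure is an assumption made for illustration:

```python
# Game percentages for the arrowed shapes, copied from the table above.
game_pct = {
    "4-4-4-1": 32, "5-4-3-1": 33, "7-2-2-2": 34, "6-3-3-1": 34,
    "5-5-2-1": 38, "5-4-4-0": 45,
}

# Shapes marked in the 531 column and in the Zar column respectively.
group_531 = ["4-4-4-1", "5-4-3-1", "7-2-2-2", "6-3-3-1"]
group_zar = ["7-2-2-2", "6-3-3-1", "5-5-2-1", "5-4-4-0"]

def spread(shapes):
    """Range (max minus min) of the game percentage within a group."""
    values = [game_pct[s] for s in shapes]
    return max(values) - min(values)

print("531 group spread:", spread(group_531))
print("Zar group spread:", spread(group_zar))
```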
A bit of blatant self-pimping - I've got a new poker book that's getting good reviews.

#258 User is offline   hrothgar 

  • Group: Advanced Members
  • Posts: 15,488
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Natick, MA
  • Interests:Travel
    Cooking
    Brewing
    Hiking

Posted 2005-September-07, 20:25

Zar, on Sep 7 2005, 08:22 PM, said:

It’s a good idea to check it out first, though, just for yourself, so you don’t get embarrassed in public. You can go to the website and automatically check the 7-4-1, 5-3-1, and 3-2-1. Yeah ... it will take some reading, sorry.

I have a better idea... BUMRAP + 5/3/1 is one of the distributions that people are most interested in. Wouldn't it make sense to bother to post the results rather than forcing folks to wade through your web site?

I'm still a bit confused regarding the accuracy of the Goren 4/3/2/1 point count...
When Tysen provided standard error calculations for a variety of hand evaluation metrics he posted the following data:

Method      R2     Standard Error
Zar + fit   0.74   1.05
HCP         0.65   1.21

It might be worthwhile to try to reconcile the difference...
Alderaan delenda est

#259 User is offline   inquiry 

  • Group: Admin
  • Posts: 14,566
  • Joined: 2003-February-13
  • Gender:Male
  • Location:Amelia Island, FL
  • Interests:Bridge, what else?

Posted 2005-September-08, 07:54

tysen2k, on Sep 7 2005, 07:13 PM, said:

To copy the metric being developed in the other thread on DD evaluations (which is starting to generate some interesting discussion), let's see how it applies to shapes as a whole.

Given the shape of our hand, what is the chance that we have a game?

Shape    Game?  531    Zar
4-3-3-3   25%
4-4-3-2   26%
5-3-3-2   27%
5-4-2-2   28%
6-3-2-2   30%
4-4-4-1   32%   <--
5-4-3-1   33%   <--
7-2-2-2   34%   <--    <--
6-3-3-1   34%   <--    <--
6-4-2-1   37%
5-5-2-1   38%          <--
7-3-2-1   38%
5-4-4-0   45%          <--
5-5-3-0   46%
6-4-3-0   47%
6-5-1-1   49%
7-4-1-1   50%
6-5-2-0   54%
7-4-2-0   55%


The first set of arrows shows hands that a 5-3-1 system counts as "equivalent."  The second set of arrows shows some hands that Zar counts as equivalent.  Which set looks more tightly clustered to you?

I am not sure you really want to compare ZAR's initial evaluation with BUMRAP 5+3+1 using the data you created in the other thread, because BUMRAP will suffer by the comparison. But since you do, OK, here goes.

First, the table you show here is about whether GAME can be made. This depends on a statistical evaluation of the chance of a fit, whether the fits found are in the majors or not, etc. For example, in the table you quote so happily to show that BUMRAP 531 bundles its numbers close together, you ignore the fact that each game percentage is a mixture of shapes. Take the 5440 shape: you quote it as 45%, when in fact your data suggest it is either 49% (if both majors) or 42% (if the 5-4 is in the minors). As you correctly point out in the other thread, the difference here is related to the fact that with 5-4 in the minors, if you have a minor fit you need to take one more trick. But even with 5-4 in the majors, some percentage of the hands will make game in the four-card major (or in NT). So this is not an easy evaluation to make. And BUMRAP 531 and ZAR go separate ways once the bidding has progressed: if a fit is found, ZAR will balloon a 5440 hand up by at least 3 more points, and possibly by 9. At five points per trick, that is roughly 0.6 to 1.8 additional tricks. And if no fit is found, the ZAR evaluation of this hand might shrink. So for "GAME" evaluation, you have to take into account the statistical probabilities of fits, and what effect such fits would have on the evaluation of these patterns (that alone is interesting enough; I have toyed with it using your data to see if ZAR's "correction" factors for FIT and MISFIT are close).

But instead let's just deal with the concept of "DISTRIBUTIONAL POINTS" from the hand patterns in isolation. To do so, don't use the GAME % data (which is a concoction of major versus minor, fit versus no-fit). Let's just take the overall trick-taking potential into account. To do this, we make the following assumptions (I like it when assumptions are given so all can agree or disagree). BUMRAP distribution is +1 for each card over four in a suit, +1 for a doubleton, +3 for a singleton and +5 for a void. The ZAR BASIC distribution count is twice the longest suit, plus the difference between the second longest and the shortest. For ZAR points, one trick is worth FIVE POINTS; for BUMRAP 5+3+1, one trick is worth 2.5 points.

With these assumptions, we further agree that the 4333 pattern is the base hand pattern. For BUMRAP 531 this is worth 0 distributional points. For ZAR, this is worth 8. For calculation purposes, we will subtract 8 ZAR points from this pattern and all other patterns. This is to determine how many more ZAR points (the trick-taking potential) that hand pattern is worth compared to the worst holding. For BUMRAP we will calculate the points as above. Further, we will divide the ZAR points (minus the base of 8) by five to determine the "trick-taking potential" of the hand pattern. We will divide BUMRAP 5+3+1 by 2.5 to determine the same value. To determine the number of tricks, we then add the "trick-corrected" values for ZAR points and BUMRAP 5+3+1 to 7.8 (the number of tricks taken with 4333 hands) and compare the results with the observed trick-taking potential of each distribution.
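The assumptions above translate directly into a few lines of code. This is a minimal sketch with hypothetical function names; the constants 8, 5, 2.5 and 7.8 are the ones stated in this post, and "5+3+1 with length" means the count exactly as defined here (shortage points plus one point per card over four):

```python
BASE_TRICKS = 7.8       # tricks assumed for the 4-3-3-3 base pattern
ZAR_BASE = 8            # ZAR distribution points of 4-3-3-3
ZAR_PER_TRICK = 5       # ZAR points per trick
B531_PER_TRICK = 2.5    # 5+3+1 points per trick, as assumed in this post

def parse(shape):
    """Turn '7-4-2-0' into suit lengths sorted longest first."""
    return sorted((int(n) for n in shape.split("-")), reverse=True)

def zar_distribution(shape):
    """Twice the longest suit plus (second longest minus shortest)."""
    a, b, _, d = parse(shape)
    return 2 * a + b - d

def length_531(shape):
    """+1 per card over four, +1 doubleton, +3 singleton, +5 void."""
    suits = parse(shape)
    length = sum(n - 4 for n in suits if n > 4)
    shortage = sum({0: 5, 1: 3, 2: 1}.get(n, 0) for n in suits)
    return length + shortage

def zar_tricks(shape):
    return BASE_TRICKS + (zar_distribution(shape) - ZAR_BASE) / ZAR_PER_TRICK

def b531_tricks(shape):
    return BASE_TRICKS + length_531(shape) / B531_PER_TRICK

for shape in ("4-3-3-3", "5-4-3-1", "7-4-2-0"):
    print(f"{shape}: ZAR {zar_tricks(shape):.2f}, 5+3+1 {b531_tricks(shape):.2f}")
```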

The number of tricks present in each hand pattern will be the number you determined in your investigation. This is without regard to which suits have the pattern. Here is the data...

Pattern   Trick   +trk    ZAR     Ztri    Bum     Btric
4-3-3-3   7.80    0.00    8.00    0.00    0.00    0.00
4-4-3-2   8.09    0.29    10.00   0.40    1.00    0.40
5-3-3-2   8.14    0.34    11.00   0.60    2.00    0.80
5-4-2-2   8.41    0.61    12.00   0.80    3.00    1.20
6-3-2-2   8.51    0.71    13.00   1.00    4.00    1.60
4-4-4-1   8.62    0.82    11.00   0.60    3.00    1.20
5-4-3-1   8.69    0.89    13.00   1.00    4.00    1.60
6-3-3-1   8.78    0.98    14.00   1.20    5.00    2.00
7-2-2-2   8.91    1.11    14.00   1.20    6.00    2.40
6-4-2-1   9.02    1.22    15.00   1.40    6.00    2.40
5-5-2-1   9.03    1.23    14.00   1.20    6.00    2.40
7-3-2-1   9.14    1.34    16.00   1.60    7.00    2.80
5-4-4-0   9.38    1.58    14.00   1.20    6.00    2.40
5-5-3-0   9.51    1.71    15.00   1.40    7.00    2.80
6-4-3-0   9.51    1.71    16.00   1.60    7.00    2.80
8-2-2-1   9.57    1.77    17.00   1.80    9.00    3.60
6-5-1-1   9.61    1.81    16.00   1.60    9.00    3.60
7-3-3-0   9.65    1.85    17.00   1.80    8.00    3.20
7-4-1-1   9.67    1.87    17.00   1.80    9.00    3.60
8-3-1-1   9.83    2.03    18.00   2.00    10.00   4.00
6-5-2-0   9.88    2.08    17.00   1.80    9.00    3.60
7-4-2-0   9.89    2.09    18.00   2.00    9.00    3.60

A quick examination of this table shows that the ZAR basic initial evaluation is much closer to the trick-taking potential of the VAST majority of the hands than BUMRAP + 531. In fact, BUMRAP's estimate is not closer to the actual number of tricks on any hand pattern. On 4432 both methods estimate 0.4 tricks for a pattern worth 0.29 tricks, so each is off by 0.11 tricks, and of course both methods by definition tie on the 4333 pattern. In all other cases ZAR is closer, and usually much closer, to the trick-taking potential.

For instance, if we examine the last row of data, we find that 7-4-2-0 is 18 ZAR points (7*2 + 4 - 0). When we subtract the base of 8, this is 10 ZAR points above the base distribution. 10 ZAR points is 2 tricks (10/5 = 2). So when we add 2 to the base tricks of the 4333 hand, we calculate this as 9.8 tricks (7.8 for 4333 plus two is 9.8). The actual number of tricks you calculated for 7420 was 9.89, so the ZAR base estimate is "off" by 0.09 tricks. BUMRAP 5+3+1, by contrast, would add 3 points for the seven-card suit, 1 point for the doubleton and 5 points for the void. That totals 9 points. Then we divide the 9 points by 2.5 to discover that BUMRAP 5+3+1 predicts this hand is worth 3.6 tricks more than the base hand pattern. So it overestimates the "power of this hand" by slightly more than 1.5 tricks.

Anyone looking at the columns of numbers will see that for "initial" evaluation of the hand patterns for trick-taking potential, ZAR BASE points handily do a much better job than 5+3+1.
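To go from eyeballing the columns to a single figure, one can compute the mean absolute error of each method against the observed extra tricks. The sketch below recomputes both estimates from the rules stated earlier in this post rather than reading them off the table, and uses mean absolute error as one reasonable measure among several; the function names are hypothetical:

```python
# Observed extra tricks over 4-3-3-3, copied from the "+trk" column above.
OBSERVED = {
    "4-3-3-3": 0.00, "4-4-3-2": 0.29, "5-3-3-2": 0.34, "5-4-2-2": 0.61,
    "6-3-2-2": 0.71, "4-4-4-1": 0.82, "5-4-3-1": 0.89, "6-3-3-1": 0.98,
    "7-2-2-2": 1.11, "6-4-2-1": 1.22, "5-5-2-1": 1.23, "7-3-2-1": 1.34,
    "5-4-4-0": 1.58, "5-5-3-0": 1.71, "6-4-3-0": 1.71, "8-2-2-1": 1.77,
    "6-5-1-1": 1.81, "7-3-3-0": 1.85, "7-4-1-1": 1.87, "8-3-1-1": 2.03,
    "6-5-2-0": 2.08, "7-4-2-0": 2.09,
}

def suits(shape):
    return sorted((int(n) for n in shape.split("-")), reverse=True)

def zar_extra_tricks(shape):
    # (2 * longest + second longest - shortest - base of 8) / 5 points per trick
    a, b, _, d = suits(shape)
    return (2 * a + b - d - 8) / 5

def b531_extra_tricks(shape):
    # Length points plus 5/3/1 shortage points, at 2.5 points per trick.
    points = sum(max(n - 4, 0) + {0: 5, 1: 3, 2: 1}.get(n, 0)
                 for n in suits(shape))
    return points / 2.5

def mean_abs_error(estimate):
    return sum(abs(estimate(s) - t) for s, t in OBSERVED.items()) / len(OBSERVED)

zar_mae = mean_abs_error(zar_extra_tricks)
b531_mae = mean_abs_error(b531_extra_tricks)
print(f"ZAR basic: {zar_mae:.2f} tricks, 5+3+1 with length: {b531_mae:.2f} tricks")
```

Note that the result depends heavily on the 2.5 points-per-trick divisor assumed for 5+3+1; a different divisor would give a different picture.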

Now if you go back to the old standard 3+2+1, which you abandoned, it would work better for you, but you realize from your own study that 321 is not aggressive enough when it comes to bidding games and slams. So why is it that 531 is "better" when it clearly overestimates the power of the hand pattern in isolation (as shown here by your own data)?

The answer is clear, and it is why the ZAR method works better. It can be seen by looking at the power of such odd little hand patterns as 5440. A 5440 hand pattern is worth "only" 14 ZAR (about 1.2 tricks), while BUMRAP 531 evaluates it as 6 points (or 2.4 tricks). 5440 is way up on the game-taking potential list, much higher than reflected by the 1.2 tricks shown by ZAR. But this is a statistical probability thing. If you are 5440, what are the chances you have an 8-card or better fit in one of your suits? If you find a fit, ZAR will automatically add from 3 to 9 more points to this hand, let's say an average of six. So it jumps up by about another trick, to the 2.2 range. And if a superfit is found, then with this distribution it can be even more. So with ZAR, when a fit is found, hands like 5440 can upgrade dramatically. Consider 7420. This pattern is worth 18 ZAR (2 tricks). But if you open the seven-card suit and partner raises, you gain 6 more points for the void and 1 point for the doubleton (although superfit points might be more valuable), and your 2-trick evaluation becomes worth more than 3. There is a similar upward revaluation if your partner bids your four-card suit.

So it turns out ZAR is "as aggressive" as, and in fact more aggressive than, 531 when fits are found, but is (excuse me for this) safer (saner?) when no fit is found or a misfit exists.

So the question for the readers and the method developers is: is it better to stretch the initial evaluation by using 531 (instead of 321, which more accurately reflects the trick-taking potential of these patterns) so as to "guess" at the potential if a fit is found, or is it better to start with an accurate estimate of the potential of each hand, and then "adjust" as fits are found?

A case in point: the lowly 4441 hand. This is worth only 11 ZAR points (11-8 = 3, or a little over half a trick), but 3 BUMRAP points, or slightly more than a trick. But notice what happens when a fit is found (not a sure thing, but a statistical probability). Now ZAR will add 2 points for the singleton, so the method reaches about 1 full trick. And if this hand is opened, there is a great chance it will have an honor (A, K, Q, J, T) in the fit suit, for another point or two. So ZAR becomes the same as BUMRAP distributionally here. In fact, when you see this hand and count your 11 ZARs, you can almost mentally add another two full ZAR points, because better than 9 times out of 10 you will get those.

In fact, the "statistical" chances of a fit make it interesting to look at which hand patterns are out of line between ZAR and BUMRAP 531 in the game percentage calculations. If you are 5440, for instance, would you stick with the mere 14 ZAR count? What is the statistical probability that you will have, instead of 14, 17 or 20 ZAR points? The chances are fairly good. 20 ZAR points is about 2.4 tricks above the base; BUMRAP 531 seems to assume a fit, as it counts this hand as 2.4 distributional tricks from the start. So could the real difference between ZAR and BUMRAP 531 be how well, statistically, BUMRAP predicts fits? If a fit exists, BUMRAP is OK? Something to ponder.
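The 5440 arithmetic can be laid out in a few lines. This is a sketch under the assumptions used in this post (base of 8, five ZAR points per trick, 2.5 points per trick for 5+3+1); the fit bonuses of 3, 6 and 9 are simply the low end, the guessed average and the high end mentioned above, not values taken from the official ZAR fit rules:

```python
ZAR_BASE = 8
ZAR_PER_TRICK = 5

initial_zar = 2 * 5 + 4 - 0    # 5-4-4-0: twice the longest + second - shortest
fixed_531 = (1 + 5) / 2.5      # +1 for the fifth card, +5 for the void

# Extra tricks over the 4-3-3-3 base as the fit bonus grows.
upgrades = {}
for bonus in (0, 3, 6, 9):
    total = initial_zar + bonus
    upgrades[total] = (total - ZAR_BASE) / ZAR_PER_TRICK
    print(f"{total} ZAR -> {upgrades[total]:.1f} tricks over base")

print(f"5+3+1 with length, fit or no fit: {fixed_531:.1f} tricks over base")
```

Seen this way, the fixed 5+3+1 figure sits where ZAR arrives after a six-point fit adjustment, which is the sense in which it "seems to assume a fit".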

Ben
--Ben--

#260 User is offline   cherdano 

  • 5555
  • Group: Advanced Members
  • Posts: 9,519
  • Joined: 2003-September-04
  • Gender:Male

Posted 2005-September-08, 08:56

Ben, first about some naming confusion: BUMRAP is just the 4.5-3-1.5-0.75-0.25 HCP scale, no distribution. BUMRAP + 531 is this plus 5 for a void, 3 for a singleton, 1 for a doubleton (no points for length). If you do BUMRAP + 531 + points for length you completely overvalue distribution.
Then there is TSP, which uses, IIRC, the same point scale for honors as Zar, and 531 + length points for distribution (plus other stuff). So what your table shows are the TSP distribution points, not BUMRAP.

Where did you take this number 2.5 from, by which you divided the TSP distribution points? It seems completely random, and one look at your table confirms that it is nonsense. I think you should use 5, if I understand Tysen's RGB post correctly. This also makes sense since, as you have pointed out a couple of times, TSP distribution points and Zar points are pretty close.
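A quick sketch of the two points above, using 7-4-2-0 as the example: shortage-only 531 versus 531 plus length points, and the effect of dividing by 5 instead of 2.5. The function names are made up for illustration, the observed figure is the one from Ben's table, and the divisor of 5 is the suggestion made here, not a confirmed TSP constant:

```python
SHORTAGE = {0: 5, 1: 3, 2: 1}

def shortage_531(suits):
    """5 for a void, 3 for a singleton, 1 for a doubleton; nothing for length."""
    return sum(SHORTAGE.get(n, 0) for n in suits)

def length_points(suits):
    """One point for each card over four in a suit."""
    return sum(max(n - 4, 0) for n in suits)

def zar_distribution(suits):
    a, b, _, d = sorted(suits, reverse=True)
    return 2 * a + b - d

shape = (7, 4, 2, 0)
observed_extra = 2.09                      # extra tricks in Ben's table

bumrap_531 = shortage_531(shape)           # what "BUMRAP + 531" adds
tsp_style = bumrap_531 + length_points(shape)   # what the table actually used

print("shortage only:", bumrap_531, " with length:", tsp_style)
print("with length / 2.5:", tsp_style / 2.5)
print("with length / 5:  ", tsp_style / 5)
print("Zar (minus 8) / 5:", (zar_distribution(shape) - 8) / 5)
print("observed:         ", observed_extra)
```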

Arend
The easiest way to count losers is to line up the people who talk about loser count, and count them. -Kieran Dyke
