Jump to content

Module talk:TableTools: Difference between revisions

From Fifth Empire Wiki
No edit summary
Line 60: Line 60:
:: There's also no such ideom of "NaN" in Mediawiki parameters. Given that Lua already states that multiple NaN values compare effectively as "equal", it makes no sense to keep these duplicates which occur occasionally as result of computed numeric expressions (numbers in Lua never distinguish any NaN, neither does MediaWiki). You "idiom" is that used in Java or C++, which is NOT directly usd in MadiaWiki or Lua (hidden in the underlying internal representation). Keeping these duplicates should complicates situation where we expect "keys" in Lua tables to be unique. Not doing that means that we'll not properly detect overrides, and it will be impossible to map any action of value to NaN, meaning that we'll always get unspecified results that we can never catch in MediaWiki templates (this means undetected errors and unspecified behavior, wrong results in all cases). Lua treats all NaN as unique values which compare equal between each other, but different from all other non-NaN values (including "nil"/unmapped). Note that "nil" in (which matches unspecified MediaWiki values) is NOT "NaN" and is also NOT an "empty string" value (which is only one of the possible default values for unspecified but cannot match any "NaN"). "NaN" in Mediawiki is a string, and does not match the Lua NaN numeric value.
:: There's also no such ideom of "NaN" in Mediawiki parameters. Given that Lua already states that multiple NaN values compare effectively as "equal", it makes no sense to keep these duplicates which occur occasionally as result of computed numeric expressions (numbers in Lua never distinguish any NaN, neither does MediaWiki). You "idiom" is that used in Java or C++, which is NOT directly usd in MadiaWiki or Lua (hidden in the underlying internal representation). Keeping these duplicates should complicates situation where we expect "keys" in Lua tables to be unique. Not doing that means that we'll not properly detect overrides, and it will be impossible to map any action of value to NaN, meaning that we'll always get unspecified results that we can never catch in MediaWiki templates (this means undetected errors and unspecified behavior, wrong results in all cases). Lua treats all NaN as unique values which compare equal between each other, but different from all other non-NaN values (including "nil"/unmapped). Note that "nil" in (which matches unspecified MediaWiki values) is NOT "NaN" and is also NOT an "empty string" value (which is only one of the possible default values for unspecified but cannot match any "NaN"). "NaN" in Mediawiki is a string, and does not match the Lua NaN numeric value.
:: It's only when we start speaking about fuzzy logic with measured accuracy, that we introduce many other values by representing fuzzy booleans by a real number between 0.0 and 1.0, but then even this accuracy may be sometimes not measurable and will also needs an additional Nan value, so fuzzy booleans have infinite number of values between 0.0 and 1.0 plus Nan; (0.0 representing absolutely false, 0.5 representing false or true equally, and 1.0 representing absolutely true, the rule being that if you sum the response rates, you'll get always 1.0). [User:Verdy p|verdy_p]] ([[User talk:Verdy p|talk]])
:: It's only when we start speaking about fuzzy logic with measured accuracy, that we introduce many other values by representing fuzzy booleans by a real number between 0.0 and 1.0, but then even this accuracy may be sometimes not measurable and will also needs an additional Nan value, so fuzzy booleans have infinite number of values between 0.0 and 1.0 plus Nan; (0.0 representing absolutely false, 0.5 representing false or true equally, and 1.0 representing absolutely true, the rule being that if you sum the response rates, you'll get always 1.0). [User:Verdy p|verdy_p]] ([[User talk:Verdy p|talk]])
:: In summary the statement "-- NaNs can't be table keys, and they are also unique, so we don't need to check existence." is compeltely false: not checking this existence means that the line "ret[#ret + 1] = v" will be executed multiple times (each time "v" is NaN) so "ret[]" will contain multiple "NaN" values... And this is not expected.
:: Note you may want to have "uniqueNan" set to true by default (the code above keeps the existing default behavior when the optional "uniqueNan" parameter is not specified, i.e. it keeps these duplicates).
:: If you drop that optional parameter, by forcing the function to behave like if it was always true, all that is needed is to change the line "if uniqueNan == nil or uniqueNan == false or uniqueNan == true and not hasNan then" to "if not hasNan then"... [[User:Verdy p|verdy_p]] ([[User talk:Verdy p|talk]]) 20:40, 5 August 2018 (UTC)

Revision as of 20:40, 5 August 2018

Template:Oldtfdfull

removeDuplicate does not remove duplicate NaN

<source lang="lua"> function p.removeDuplicates(t) checkType('removeDuplicates', 1, t, 'table') local isNan = p.isNan local ret, exists = {}, {} for i, v in ipairs(t) do if isNan(v) then -- NaNs can't be table keys, and they are also unique, so we don't need to check existence. ret[#ret + 1] = v else if not exists[v] then ret[#ret + 1] = v exists[v] = true end end end return ret end </source> This should be: <source lang="lua"> function p.removeDuplicates(t, uniqueNan) checkType('removeDuplicates', 1, t, 'table') local ret, isNan, hasNan, exists = {}, p.isNan, false, {} for _, v in ipairs(t) do -- NaNs can't be table keys in exists[], and they are also equal to each other in Lua. if isNan(v) then -- But we may want only one Nan in ret[], and there may be multiple Nan's in t[]. if uniqueNan == nil or uniqueNan == false or uniqueNan == true and not hasNan then hasNan = true ret[#ret + 1] = v end else if not exists[v] then exists[v] = true ret[#ret + 1] = v end end end return ret end </source> -- verdy_p (talk) 07:50, 2 February 2014 (UTC)

@Verdy p: This was by design, as comparing two NaNs always results in false. My reasoning was that since two NaNs can never be equal to each other - even if they were made by the exact same calculation - then they shouldn't be treated as duplicates by the algorithm. Although if there's some sort of precedent for doing things a different way, please let me know. I'm fairly new to the world of NaNs, after all. — Mr. Stradivarius ♪ talk ♪ 08:01, 2 February 2014 (UTC)
That's the Lua interpretation anyway. Even if it has a single Nan value (no distinction between signaling and non-signaling ones, or Nan's carrying an integer type, like in IEEE binary 32-bit float and 64-bit double formats, neither does Java...), there are some apps that depend on using Nan as a distinctive key equal to itself, but still different from nil.
The other kind of usage of Nan is "value not set, ignore it": when computing averages for example, Nan must not be summed and not counted, so all Nan's should be removed from the table. For this case May be there should be an option to either preserve all Nan's, or nuke them all from the result: the kill option would be tested in the if-branch of your first version, and a second alternate option tested after it would be to make Nan's unique in the result.... The first case being quite common for statistics when it means "unset", while nil means something else (such as compute this value before determinig if it's a Nan, nil bring used also for weak references that can be retreived from another slow data store, and the table storing nil being a fast cache of that slow data store) verdy_p (talk) 08:29, 2 February 2014 (UTC)
I had a quick look at the functions, but cannot work out the intended usage—that usage would probably determine what should happen with NaNs. The docs for removeDuplicates says that keys that are not positive integers are ignored—but what it means is that such keys are removed. On that principle, I think it would be better to default to removing NaNs. I cannot imagine a usage example where I would want to call this function and have a NaN in the result—what would I do with it? If it were desirable to test for the presence of NaN members, why not return an extra value that is true if one or more NaNs are encountered and omitted? How would it help to ever have both 0/0 and -0/0 (or multiple instances of 0/0) in the result?
Regarding verdy_p's code: I don't think it is a good idea to deviate from Lua's idioms, and if there were a uniqueNan parameter, it should not be tested for explicit "false" and "true" values. The function regards "hello" as neither false nor true and while that is a very defensible position, it's not how Lua code is supposed to work. Johnuniq (talk) 03:26, 3 February 2014 (UTC)
It takes three values (it is a "tri-state boolean"):
  • nil the default is your existing code that keeps them all,
  • true keeps a single Nan,
  • false does not even keep this one
  • any other (unsupported) value in other types is considered like nil.
There's no such "Lua idiom" (about NaN) except that the default value is nil (already a common Lua practive), and otherwise the parameter is used as a boolean flag whose value is meant by its explicit name (and is a general good practive for naming boolean flags)... Tri-state boooleans are very common objects in many situations where we need a supplementary "may be / unknown / don't know / other / not specified / not available / missing data / to be determined later / uninitialized / refuse to say / no opinion / fuzzy" value. This fuzzy concept is also carried by Nan (within the field of numbers instead of booleans) so it is really appropriate here !
There's also no such ideom of "NaN" in Mediawiki parameters. Given that Lua already states that multiple NaN values compare effectively as "equal", it makes no sense to keep these duplicates which occur occasionally as result of computed numeric expressions (numbers in Lua never distinguish any NaN, neither does MediaWiki). You "idiom" is that used in Java or C++, which is NOT directly usd in MadiaWiki or Lua (hidden in the underlying internal representation). Keeping these duplicates should complicates situation where we expect "keys" in Lua tables to be unique. Not doing that means that we'll not properly detect overrides, and it will be impossible to map any action of value to NaN, meaning that we'll always get unspecified results that we can never catch in MediaWiki templates (this means undetected errors and unspecified behavior, wrong results in all cases). Lua treats all NaN as unique values which compare equal between each other, but different from all other non-NaN values (including "nil"/unmapped). Note that "nil" in (which matches unspecified MediaWiki values) is NOT "NaN" and is also NOT an "empty string" value (which is only one of the possible default values for unspecified but cannot match any "NaN"). "NaN" in Mediawiki is a string, and does not match the Lua NaN numeric value.
It's only when we start speaking about fuzzy logic with measured accuracy, that we introduce many other values by representing fuzzy booleans by a real number between 0.0 and 1.0, but then even this accuracy may be sometimes not measurable and will also needs an additional Nan value, so fuzzy booleans have infinite number of values between 0.0 and 1.0 plus Nan; (0.0 representing absolutely false, 0.5 representing false or true equally, and 1.0 representing absolutely true, the rule being that if you sum the response rates, you'll get always 1.0). [User:Verdy p|verdy_p]] (talk)
In summary the statement "-- NaNs can't be table keys, and they are also unique, so we don't need to check existence." is compeltely false: not checking this existence means that the line "ret[#ret + 1] = v" will be executed multiple times (each time "v" is NaN) so "ret[]" will contain multiple "NaN" values... And this is not expected.
Note you may want to have "uniqueNan" set to true by default (the code above keeps the existing default behavior when the optional "uniqueNan" parameter is not specified, i.e. it keeps these duplicates).
If you drop that optional parameter, by forcing the function to behave like if it was always true, all that is needed is to change the line "if uniqueNan == nil or uniqueNan == false or uniqueNan == true and not hasNan then" to "if not hasNan then"... verdy_p (talk) 20:40, 5 August 2018 (UTC)