Module talk:TableTools

From Fifth Empire Wiki
Revision as of 08:29, 2 February 2014 by wikipedia>Verdy p

removeDuplicate does not remove duplicate NaN

<source lang="lua">
function p.removeDuplicates(t)
	checkType('removeDuplicates', 1, t, 'table')
	local isNan = p.isNan
	local ret, exists = {}, {}
	for i, v in ipairs(t) do
		if isNan(v) then
			-- NaNs can't be table keys, and they are also unique, so we don't need to check existence.
			ret[#ret + 1] = v
		else
			if not exists[v] then
				ret[#ret + 1] = v
				exists[v] = true
			end
		end
	end
	return ret
end
</source>

This should be:

<source lang="lua">
function p.removeDuplicates(t)
	checkType('removeDuplicates', 1, t, 'table')
	local ret, isNan, exists, hasNan = {}, p.isNan, {}, nil
	for _, v in ipairs(t) do
		-- NaNs can't be table keys in exists[], and they are also equal to each other in Lua.
		if isNan(v) then
			-- But we want only one Nan in ret[], and there may be multiple Nan's in t[].
			if not hasNan then
				hasNan = true
				ret[#ret + 1] = v
			end
		else
			if not exists[v] then
				exists[v] = true
				ret[#ret + 1] = v
			end
		end
	end
	return ret
end
</source>

-- verdy_p (talk) 07:50, 2 February 2014 (UTC)

@Verdy p: This was by design, as comparing two NaNs always results in false. My reasoning was that since two NaNs can never be equal to each other - even if they were made by the exact same calculation - then they shouldn't be treated as duplicates by the algorithm. Although if there's some sort of precedent for doing things a different way, please let me know. I'm fairly new to the world of NaNs, after all. — Mr. Stradivarius ♪ talk ♪ 08:01, 2 February 2014 (UTC)
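For readers following along, the two NaN properties under discussion can be checked with a minimal sketch in plain Lua. The names `nan`, `isNan`, `t` and `ok` are local to this illustration and are not taken from the module; `isNan` here is only assumed to behave like `p.isNan`.

```lua
-- Illustrative sketch only; these names are not part of Module:TableTools.
local nan = 0/0

-- A NaN is the only number that compares unequal to itself.
local function isNan(v)
	return type(v) == 'number' and v ~= v
end

print(nan == nan)  -- false
print(isNan(nan))  -- true

-- Writing to a table with a NaN key raises an error, which is why
-- the exists[] lookup cannot be used for NaN values.
local t = {}
local ok = pcall(function() t[nan] = true end)
print(ok)          -- false
```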
That's the Lua interpretation anyway. Even though Lua has a single NaN value (it makes no distinction between signaling and non-signaling NaNs, or NaNs carrying an integer payload, as in the IEEE binary 32-bit float and 64-bit double formats; neither does Java...), some apps depend on using NaN as a distinctive key that is equal to itself but still different from nil.
The other kind of usage of NaN is "value not set, ignore it": when computing averages, for example, NaNs must be neither summed nor counted, so all NaNs should be removed from the table. For this case, maybe there should be an option to either preserve all NaNs or nuke them all from the result: the kill option would be tested in the if-branch of your first version, and a second, alternate option tested after it would make NaNs unique in the result. The first case is quite common in statistics, where NaN means "unset" while nil means something else (such as "compute this value before determining whether it's a NaN"; nil is also used for weak references that can be retrieved from another, slow data store, with the table storing nil acting as a fast cache of that slow data store). verdy_p (talk) 08:29, 2 February 2014 (UTC)
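The option suggested above could be sketched roughly as follows. This is not the module's API: the function name `removeDuplicatesWithNanMode`, the `nanMode` parameter and its values `'keep'`, `'unique'` and `'remove'` are all hypothetical, and the sketch is written in plain Lua without `checkType` so that it runs on its own.

```lua
-- Hypothetical sketch; none of these names exist in Module:TableTools.
local function isNan(v)
	return type(v) == 'number' and v ~= v
end

-- nanMode (assumed values):
--   'keep'   : preserve every NaN (the behaviour of the first version)
--   'unique' : keep only the first NaN
--   'remove' : drop all NaNs
local function removeDuplicatesWithNanMode(t, nanMode)
	nanMode = nanMode or 'keep'
	local ret, exists, hasNan = {}, {}, false
	for _, v in ipairs(t) do
		if isNan(v) then
			-- NaNs can't be keys in exists[], so they are handled separately.
			if nanMode == 'keep' then
				ret[#ret + 1] = v
			elseif nanMode == 'unique' and not hasNan then
				hasNan = true
				ret[#ret + 1] = v
			end
			-- In 'remove' mode the NaN is simply skipped.
		elseif not exists[v] then
			exists[v] = true
			ret[#ret + 1] = v
		end
	end
	return ret
end

local nan = 0/0
local input = {1, nan, 2, nan, 1}
print(#removeDuplicatesWithNanMode(input))            -- 4
print(#removeDuplicatesWithNanMode(input, 'unique'))  -- 3
print(#removeDuplicatesWithNanMode(input, 'remove'))  -- 2
```

Defaulting to `'keep'` leaves existing callers unaffected, while the statistics use case would pass `'remove'` before summing and counting.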