Fafb062f6e8c394dd6b0689c13db2b88

Find the intersection of arrays contained within a hash. So given {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]} return [3,5]

1
2
3
4
5
6
7
8
9
pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}
intersection = []
pools.each do |k,v|
  not_k = pools.dup
  not_k.delete_if{|nk,nv| nk == k }
  inter << (v & not_k.values.flatten)
end
intersection.flatten.uniq

Refactorings

No refactoring yet !

5170ca260dbd2cdfd5a887a4dba7636f

Jeremy

July 31, 2008, July 31, 2008 21:59, permalink

1 rating. Login to rate!

Here are a couple of options, depending on which version of Ruby you're using. Unfortunately Array#count is not available in Ruby 1.8.6 (although you could add it to the class pretty easily if you wanted to).

1
2
3
4
5
6
7
8
9
10
# Ruby 1.8.6
pools = {1 => [1,2,3], 2 => [3,4,5], 3 => [5,6,7]}
values = pools.values.flatten
intersection = values.select { |v| values.select { |w| w == v }.size > 1 }.uniq


# Ruby 1.8.7+
pools = {1 => [1,2,3], 2 => [3,4,5], 3 => [5,6,7]}
values = pools.values.flatten
intersection = values.select { |v| values.count(v) > 1 }.uniq
1e8f141e7857d397d8020ed3b759e88a

Maciej Piechotka

July 31, 2008, July 31, 2008 22:03, permalink

No rating. Login to rate!

May be something like that

1
2
3
pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}
rej = Hash.new {0}
pools.values.flatten.sort.reject! {|elem| rej[elem]++ != 1}
Fafb062f6e8c394dd6b0689c13db2b88

AviFlombaum

July 31, 2008, July 31, 2008 22:14, permalink

No rating. Login to rate!

Nice! However, the issue with your method is on large data sets that select statement requires you to iterate over each element in the array which adds significant time to the task. My method above, while not as pretty and concise, save significant processing time. I guess I should have mentioned that performance is a huge issue on this one.

Aedd89a10aba3a46576ea4f604146c65

badcarl

July 31, 2008, July 31, 2008 23:06, permalink

1 rating. Login to rate!
1
2
3
4
5
require "rubygems"
require "facets/enumerable/probability"

pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}  
intersection = pools.values.flatten.frequency.delete_if { |k, v| v == 1 }.keys
Fafb062f6e8c394dd6b0689c13db2b88

AviFlombaum

August 1, 2008, August 01, 2008 02:21, permalink

No rating. Login to rate!

badcarl - thanks! That's awesome and fast - but in the end, I like my method best - requires no gems, is hella fast (uses less mem and runs faster on large sets) and I can pretty it up a bit. Thanks though, you both rock!

Aedd89a10aba3a46576ea4f604146c65

badcarl

August 1, 2008, August 01, 2008 04:06, permalink

No rating. Login to rate!

Here is another way. Keep a set of numbers that have appeared once. Add things to the intersection if they've appeared already.

1
2
3
4
5
6
7
require "set"

pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}  
intersection, appeared = Set.new, Set.new
pools.values.flatten.each do |value|
  appeared.include?(value) ? intersection << value : appeared << value
end
1e8f141e7857d397d8020ed3b759e88a

Maciej Piechotka

August 1, 2008, August 01, 2008 07:03, permalink

No rating. Login to rate!

The sort is unnecessary in my code

1
2
3
pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}
rej = Hash.new {0}
pools.values.flatten.reject! {|elem| rej[elem]++ != 1}
3122b90b43b4bdf9078ca287a657866c

kmcd

August 1, 2008, August 01, 2008 10:08, permalink

No rating. Login to rate!

Hi all,

This is refactoring is similar to badcarls set based suggestion. I haven't analysed the complexity of this solution yet. I'll do a performance test & dig into my old big O notes ... later.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
class Hash
  # Returns duplicate elements of values arrays. Usage:
  # {1 => [1,2,3], 2 => [3,4,5], 3 => [5,6,7]}.intersection
  # => [3,5]
  def intersection
    self.values.flatten.duplicates
  end
end

class Array
  # Returns an array of duplicate elements. Usage:
  # [1,2,3,3].duplicates
  # => [3]
  def duplicates
    dups, unique = [], []
    self.each { |e| unique.member?(e) ? dups.push(e) : unique.push(e) }
    dups
  end
end

require 'test/unit'

class TestArrayDuplicates < Test::Unit::TestCase
  def test_return_empty_with_all_uniq
    assert_equal [1,2,3].duplicates, []
  end
  
  def test_return_duplicates_with_duplicate_elements
    assert_equal [1,2,3,3].duplicates, [3]
  end
end

class TestHashIntersection < Test::Unit::TestCase
  # Using setup for clarity. 
  # assert_equal wont accept an unitialised hash
  # ie assert_equal {}, {} gives a syntax error
  def setup
    @dups_hash = {1 => [1,2], 2 => [2,3], 3 => [3,4]}
    @uniq_hash = {1 => [1,2,3], 2 => [4,5], 3 => [6,7]}
  end
  
  def test_return_empty_with_all_uniq
    assert_equal @uniq_hash.intersection, []
  end
  
  def test_return_duplicates_with_duplicate_elements
    assert_equal @dups_hash.intersection, [2,3]
  end
end
D8daae35e495689a5f17d86f15d18eb4

rjspotter

August 6, 2008, August 06, 2008 22:01, permalink

No rating. Login to rate!

I'm basicly golfing at this point but it was a fun exersize

1
2
3
4
5
6
7
pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}

pools.inject([]) {|m,x| m << x[1].select {|y| pools[x[0] + 1].include?(y) unless pools[x[0] + 1].nil?}}.flatten

#or

pools.inject([]) {|m,x| m << (x[1] & pools[x[0] + 1] unless pools[x[0] + 1].nil?)}.compact.flatten
B92160f71906401911a87e1620ec3b94

mike

August 11, 2008, August 11, 2008 01:15, permalink

No rating. Login to rate!

Read this thread and figured Enumerable#nonuniq from facets run on pools.values.flatten would probably be faster... it was!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
irb(main):001:0> @pools = {1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}
=> {1=>[1, 2, 3], 2=>[3, 4, 5], 3=>[5, 6, 7]}
irb(main):002:0> 
irb(main):003:0* def method1
irb(main):004:1>   intersection = []
irb(main):005:1>   @pools.each do |k,v|
irb(main):006:2*     not_k = @pools.dup
irb(main):007:2>     not_k.delete_if{|nk,nv| nk == k }
irb(main):008:2>     intersection << (v & not_k.values.flatten)
irb(main):009:2>   end
irb(main):010:1>   intersection.flatten.uniq
irb(main):011:1> end
=> nil
irb(main):012:0> 
irb(main):013:0* module Enumerable
irb(main):014:1>   # from facets
irb(main):015:1*   def nonuniq
irb(main):016:2>     h1 = {}
irb(main):017:2>     h2 = {}
irb(main):018:2>     each {|i|
irb(main):019:3*       h2[i] = true if h1[i]
irb(main):020:3>       h1[i] = true
irb(main):021:3>     }
irb(main):022:2>     h2.keys
irb(main):023:2>   end
irb(main):024:1> end
=> nil
irb(main):025:0> 
irb(main):026:0* require 'rubygems'; require 'benchmark'
=> true
irb(main):027:0> Benchmark.bm do |x| 
irb(main):028:1*   x.report("method1") { 100_000.times { method1 }}
irb(main):029:1>   x.report("method2") { 100_000.times { @pools.values.flatten.nonuniq }}
irb(main):030:1> end
      user     system      total        real
method1  4.070000   0.440000   4.510000 (  4.573123)
method2  1.790000   0.250000   2.040000 (  2.046597)
=> true
Fafb062f6e8c394dd6b0689c13db2b88

AviFlombaum

August 11, 2008, August 11, 2008 02:30, permalink

No rating. Login to rate!

Can you benchmark off something like this to find all intersections?
{1 => [1..50000], 2=> [20000..40000], 3=> [30000..50000]}

Avatar

leemic

August 13, 2008, August 13, 2008 02:36, permalink

No rating. Login to rate!

Can you please clarify your example? This does not make sense how you are finding intersection. Do you mean that the element appears on all the arrays?

For example, you are given the following hash

{1 => [1,2,3], 2=> [3,4,5], 3=> [5,6,7]}

This means you have three different arrays

[ 1, 2, 3 ]
[ 3, 4, 5 ]
[ 5, 6, 7 ]

There is no value that exists in all three arrays.

Your answer is [ 3, 5 ]. This means 3 came from the intersection of the first and second arrays. 5 came from the intersection of the second and third arrays.

Let's do another example

{1 => [1,2,3], 2=> [3,4,5], 3=> [1, 5,6] }

The answer should be [ 1, 3, 5 ]. However, '1' came from the intersection between the first and third arrays.

Anyway, if you need the intersection of all three arrays - here is one liner

1
2
3
require 'set'

intersect = pools.values.collect{ |arr| Set.new(arr) }.inject{ |inter, set| inter.nil? ? set : inter & set } 
Fafb062f6e8c394dd6b0689c13db2b88

AviFlombaum

August 13, 2008, August 13, 2008 05:43, permalink

No rating. Login to rate!

I mean the element appears in at least one other array besides its own.

[ 1, 2, 3 ]
[ 3, 4, 5 ]
[ 5, 6, 7 ]

is [3,5], for the reason you pointed out.

In your example: {1 => [1,2,3], 2=> [3,4,5], 3=> [1, 5,6] }

[1,3,5] is correct. I'll test out your method but I was hoping to not require any gems. Does this make sense?

Your refactoring





Format Copy from initial code

or Cancel