Replies: 1 comment
-
Hi @mkskeller, since SF/SPU uses JAX for frontend Python programs, you can use JAX's vectorization capabilities (`jax.vmap`) to optimize this program as shown below, which may help XLA do more hardware-independent optimization and produce much better code.

```python
import jax
import jax.numpy as jnp

n = 100  # example input size; use the real size of your input data
# Three id/value columns plus a category column with 5 categories (example data).
data = jnp.array([list(range(n))] * 3 + [[i % 5 for i in range(n)]])

def xtabs(data):
    xid, yid, vals, cats = data
    n = len(xid)
    num_k = 5

    def map_i(i):
        def map_j(j):
            def map_k(k):
                # Contribution of the pair (i, j) to category bucket k.
                return (xid[i] == yid[j]) * (cats[j] == k) * vals[i]
            return jax.vmap(map_k)(jnp.arange(num_k))
        return jnp.sum(jax.vmap(map_j)(jnp.arange(n)), axis=0)

    res = jnp.sum(jax.vmap(map_i)(jnp.arange(n)), axis=0)
    return res
```
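As a quick sanity check outside SPU, the vectorized `xtabs` above can be run under plain JAX with `jax.jit` and compared against a naive triple loop. The `xtabs_reference` helper below is not part of the answer above; it is only a hypothetical reference implementation for this comparison:

```python
import jax
import jax.numpy as jnp
import numpy as np

def xtabs_reference(data):
    # Naive O(n^2 * k) implementation, used only to verify the vectorized result.
    xid, yid, vals, cats = np.asarray(data)
    res = np.zeros(5, dtype=np.int64)
    for i in range(len(xid)):
        for j in range(len(yid)):
            if xid[i] == yid[j]:
                res[cats[j]] += vals[i]
    return res

n = 20
data = jnp.array([list(range(n))] * 3 + [[i % 5 for i in range(n)]])
print(jax.jit(xtabs)(data))    # vectorized version from above, jitted
print(xtabs_reference(data))   # should print the same per-category sums
```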
-
I've tried to port the cross-tabulation example from https://github.com/MPC-SoK/frameworks as follows:
It gives me the correct results, but it seems slower than most frameworks in the above repository. Is there a way to improve it, for example by using vectorized operations or by reducing the number of bits used in the comparisons? I cannot find any hints in the documentation. I'm using the latest Docker container.
Full code:
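(The snippet and full code referenced above are not reproduced in this excerpt. Purely for illustration, and not the poster's actual code, a straightforward loop-based cross-tabulation of the kind described might look roughly like the sketch below; names such as `xtabs_loops` are hypothetical.)

```python
# Hypothetical loop-based cross-tabulation (illustration only, not the code
# referenced above): for every pair (i, j) with matching ids, add vals[i]
# to the bucket of cats[j].
import jax.numpy as jnp

def xtabs_loops(xid, yid, vals, cats, num_k=5):
    res = jnp.zeros(num_k, dtype=vals.dtype)
    for i in range(len(xid)):
        for j in range(len(yid)):
            for k in range(num_k):
                res = res.at[k].add((xid[i] == yid[j]) * (cats[j] == k) * vals[i])
    return res
```

Element-by-element code like this unrolls at trace time into a very large graph of scalar comparisons and additions, which is typically why it runs much slower than the batched `vmap` formulation shown in the reply above.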