Interview question: data structure to set all values in O(1)

I encountered the following interview question on the Internet.

Describe a data structure for which getValue(int index), setValue(int index, int value), and setAllValues(int value) are all O(1).

Though array is good enough for the first and second operations to be performed in O(1), what can be proposed for the third (setAllValues)?

data-structures

2012-04-04 05:44
by Alec

@leppie I disagree - see my answer : - Timothy Jones 2012-04-04 06:05

@TimothyJones: OK, I understand your answer, but it is 'cheating'. You are only setting 1 value, not all values - leppie 2012-04-04 06:08

@leppie The data structure would work as specified, so I'm not sure what's cheating about it ; - Timothy Jones 2012-04-04 07:27

Is it feasible to implement by a hash table - Hengameh 2015-05-02 08:15

@RoyiNamir: But as soon as you set a value again, you get the old values back in your code - leppie 2016-01-10 19:05

How about an array of tuples {timestamp, value}, with an additional {timestamp, value} called all. Since you only care about the relative time of insertions, you can use a monotonically increasing id for the values of timestamp:

type Entry {
  int timestamp, 
  int value
}

type structure {
    int     id
    Entry   all
    Entry[] array
}

Initialise all fields to 0. Then the following should work for you:

setValue(index i, value v):
```
array[i] = {id++, value}
```

value getValue(index i)

if(all.timestamp > array[i].timestamp) return all.value
else return array[i].value

setAll(value v)
```
all = {id++, value}
```

A problem with this approach is that eventually you'll run out of ids for timestamp, and might wrap around. If you chose a 64 bit value to store timestamps, then this gives you 18,446,744,073,709,551,616 insertions or setAlls before this happens. Depending on the expected use of the datastructure, an O(n) cleanup phase might be appropriate, or you could just throw an exception.

Another issue that might need to be considered is multi-threading. Three obvious problems:

if id++ isn't atomic and two threads obtained a new id at the same time then you could get two entries with the same id.
if the incrementation of id and the assignment of the created tuple aren't atomic together (they're probably not) and there were two simultaneous inserts, then you could get a race condition where the older id wins.
if the assignment of the tuple isn't atomic, and there's an insert() at the same time as a get() then you might end up in a position where you've got say {new_id, old_value} in the array, causing the wrong value to be returned.

If any of these are problems, the absolute easiest solution to this is to put "not thread safe" in the documentation (cheating). Alternatively, if you can't implement the methods atomically in your language of choice, you'd need to put some sort of synchronisation locks around them.

2012-04-04 06:03
by Timothy Jones

But we can;t be sure that how it works internally. This single statement can be O(n) - ganesshkumar 2012-04-04 06:05

This relies on every call to current_time() returning a different value though (otherwise, two timestamps may be equal - Damien_The_Unbeliever 2012-04-04 06:07

@DamienTheUnbeliever that's true. You may also have to deal with wrap-around cases too, if the program was around for a while - Timothy Jones 2012-04-04 06:15

Thanks Timothy. Perfect - Alec 2012-04-04 06:42

@DamienTheUnbeliever So, I was writing up an edit that suggested using a tuple of {system_timestamp,id} in place of timestamps to get around this, when I realised that there's no need to involve system timestamps at all. See edit : - Timothy Jones 2012-04-04 06:43

@Alec You're welcome. See edit for an improved version : - Timothy Jones 2012-04-04 06:44

@TimothyJones - yes, switching to the monotonic id is an obvious fix :-) - you might also mention in the "interview" that multi-threading issues may also need to be addressed - Damien_The_Unbeliever 2012-04-04 06:46

@DamienTheUnbeliever Quite right. Maybe we should interview you next time ; - Timothy Jones 2012-04-04 07:10

what do you mean by "an array of tuples"? Is it an array with two fields {timestamp, value}? something like HashMap? (I couldn't understand your approach) - Hengameh 2015-07-23 12:18

Your signature for getValue() needs to be fixed - Patrick Roberts 2017-12-18 07:57

Thanks @PatrickRoberts! Good catch - Timothy Jones 2017-12-19 07:05

Can we use a boolean variable instead of timestamp? whenever setAll is called we set it to true and use the value we store on the side, whenever setValue is called this variable becomes false - Tam211 2019-02-13 16:35

@Tam211 - if you did that, SetAll would become O(n), because you have to update the timestamp bool for each item - Timothy Jones 2019-02-14 02:10

@TimothyJones My idea was to eliminate the timestamp bool, store only value for each element, and use a global boolean to tell us where to get the value from, if it's true we use the value which was set by setAll() if it's false we use the stored value in the array - Tam211 2019-02-15 09:00

@Tam211 I don't think that works. If you set the value at location X to 10, use setAll(100), set the value at Y to anything, then read the value at X, the single boolean approach would give the wrong answer - Timothy Jones 2019-02-19 04:30

I got the same question in one of the technical interviews.
Here is my complete ready-to-use Java-implementation including the test cases.

The key idea is to keep the value of setAll() in special variable (e.g. joker) and then trace the change of this value in a proper way.

In order to keep the code clean, some access modifiers have been abolished.

Node class:

import java.time.LocalDate;

class Node {

    int value;
    LocalDate jokerSetTime;

    Node(int val, LocalDate jokSetTime) {
        value = val;
        jokerSetTime = jokSetTime;
    }
}

DS class:

class DS {

    Node[] arr;

    DS(int len) {
        arr = new Node[len];
    }
}

DataStructure class:

import java.time.LocalDate;

class DataStructure {

    private boolean isJokerChanged;
    private Integer joker;
    private LocalDate jokerSetTime;
    private DS myDS;

    DataStructure(int len) {
        myDS = new DS(len);
    }

    Integer get(int i) {

        Integer result;

        if (myDS.arr.length < i) {
            return null;
        }

        // setAll() has been just called
        if (isJokerChanged) {
            return joker;
        }

        if (myDS.arr[i] == null) {

            // setAll() has been previously called
            if (joker != null) {
                result = joker;
            } else {
                result = null;

            }

        } else if (myDS.arr[i].jokerSetTime == jokerSetTime) {
            // cell value has been overridden after setAll() call
            result = myDS.arr[i].value;
        } else {
            result = joker;
        }

        return result;
    }

    void setAll(int val) {
        isJokerChanged = true;
        joker = val;
        jokerSetTime = LocalDate.now();
    }

    void set(int i, int val) {
        isJokerChanged = false;
        myDS.arr[i] = new Node(val, jokerSetTime);
    }
}

Main class:

class Main {

    public static void main(String[] args) {

        DataStructure ds = new DataStructure(100);

        Integer res;

        res = ds.get(3);

        ds.set(3, 10);

        res = ds.get(3);

        ds.setAll(6);

        res = ds.get(3);

        res = ds.get(15);

        ds.set(4, 7);

        res = ds.get(4);

        res = ds.get(3);

        ds.setAll(6);

        ds.set(8, 2);

        res = ds.get(3);
    }
}

Update:
The code has been updated. The previous implementation didn't take into account the case when setAll() is called twice in a row with the same value and is followed by set(x), get(y), e.g.: setAll(100), set(3, 1), setAll(100), set(5, 3), get(3).

The use of timestamp approach has been added to allow distinguishing between different setAll() calls with identical values.

P.S. This implementation is not a thread safe.

2017-01-17 07:17
by Mike B.

I have a strong hunch that this was expected. All the timestamp based solutions add completely unnecessary complexity - user4815162342 2018-04-10 14:15

@MikeB. Great implementation! Very clean and clear. At first I was having trouble understanding the previous solutions with all of the timestamps, but this is just amazing, exactly what I was looking for. Thanks again - Martin 2018-06-10 09:54

This solution is wrong. Consider the following example in the main function: ds.setAll(6); ds.set(3,10); ds.setAll(6); ds.set(4,1); res = ds.get(); It will return the incorrect value of 10, even though setAll(6) was called after it. The second ds.set() call is needed to clear the isJokerChanged variable, which would have superficially defended against this bug otherwise - AnatolyVorobey 2018-07-12 17:29

@AnatolyVorobey, the code has been fixed. Thanks for the test case - Mike B. 2018-07-15 12:34

How about an array of pointers to a single common value? Set the value, and all references will point to the single changed value in O(1)..

2012-04-04 06:06
by WOPR

Answers the question though! ;) lol - WOPR 2012-04-04 06:07

This works until you want to change one of the elements to point to something other than the common value - Timothy Jones 2012-04-04 06:09

That wasn't a requirement specified in the question - WOPR 2012-04-04 06:10

I think it is - if you call setAll(), and then setValue(), you'd have to change one of the elements - Timothy Jones 2012-04-04 06:16

I would say that specs clearly imply that it should be possible to set values individually without affecting the other values. Otherwise the index parameter to the setValue method is superfluous. And if you do not want to interpret method signatures as specifications, then it would be possible to simply implement all of the methods as NoOps.. - Alderath 2012-04-04 08:57

I was just asked this question very recently in an Interview. I came up with a hash table implementation. It would solve the problem of running out of the timestamp values but the thread safety feature needs to be implemented (probably using lazy initialization techniques)

Let's say in our class we have a private variable _defaultValue to hold a default value and we also have a private hashtable or dictionary _hashtable. SetAllValues could just set _defaultValue equal to the value passed and _hashtable initialized/set to a new hashtable and discard any reference to the old hash table. SetValue should just add a new value to _hashtable or update the value if the key ( or index) is already present in the _hashtable. GetValue should check whether the key (or index) is present in _hashtable, then return it, else return the value stored in _defaultValue.

This is my first answer on StackOverflow. I am a little lazy in writing up the code. Will probably edit the answer soon.

The interviewer nodded yes to this solution but insisted to implement it without using a hashtable. I guess, he was asking me to give a similar approach as Timothy's answer. And I wasn't able to get it at that moment :(. Anyways, Cheers!

EDIT: Posting the code (in C#)

class MyDataStruct
{
    private int _defaultValue;
    private Dictionary<int,int> _hashtable;

    public MyDataStruct()
    {
        _defaultValue = 0; // initialize default with some value
        _hashtable = new Dictionary<int, int>();
    }

    public void SetAllValues(int val)
    {
        _defaultValue = val;
        _hashtable = new Dictionary<int, int>();
    }

    public void SetValue(int index, int val)
    {
        if (_hashtable.ContainsKey(index))
        {
            _hashtable.Add(index, val);
        }
        else
        {
            _hashtable[index] = val;
        }
    }

    public int GetValue(int index)
    {
        if (_hashtable.ContainsKey(index))
        {
            return _hashtable[index];
        }
        else
        {
            return _defaultValue;
        }
    }
}

2014-08-23 07:24
by tapas nayak

could you please provide your code with hash table - Hengameh 2015-05-02 06:07

@Hengameh -- added - tapas nayak 2015-07-20 06:30

This is almost correct except that GetValue for a non existing hey would always return default value even when this key was not inserted to the data structure - akrabi 2015-11-18 18:11

We can have a variable V which stores an int and an array of containing Tuple as {Value, id}..

And a global int variable G (which will act like identity in SQL and whenever any set or setAll operation is done its value get increment by 1)

initial all Ids and V value will be default say null..

so V = null All Tuple = {null, null}
set(index i, val v) -> G = G+1, arr[i].Val = v, arr[i].Id = G

get(index i) -> if V.Id > arr[i].Id return V.Val else return arr[i]



set-all(val v) -> G= G+1, V.Id= G, V.Val = v

2014-08-23 08:11
by Pramod Gupta

All existing answers use a timestamp that is incremented on every setVal operation. This is not necessary. In fact, it is necessary only to increment the timestamp on setAll. Another issue some raised was arithmetic overflow. This can be handled without breaking the constant time bounds by updating a single cell on each setAll and performing the time comparison carefully.

How it works

The basic concept is essentially similar to that of the other answers, but with a twist.

What they do: Store the value used for the last setAll operation separately, and keep track of the time that operation was performed. Each time they perform a setVal, they store the current time along with the given value in the array. Each time they perform a getVal, they compare the time in the given location to the time the last setAll occurred, and then choose either the value at the location or the setAll value depending on which is greater.

Why this can be a problem: Suppose the current time overflows and a setAll operation occurs soon afterward. It will appear as though most of the stored array values are newer than the setAll value, when they are actually older.

The solution: Stop imagining that we're keeping track of the total amount of time that has passed since data structure initialization. Imagine a giant clock with a "second hand" that ticks not 60 times around the circle but rather 2^n times around the circle. The position of the second hand represents the time of the most recent setAll operation. Each setVal operation stores this time along with a value. So if we perform a setAll when the "clock" is at 45, and then perform six setVal operations on different elements, the setAll time and the times at all six locations will be the same. We wish to maintain the following invariant:

The time at a given element location equals the setAll time if and only if that element was set with setVal more recently than the last setAll operation.

Clearly, the procedure described above automatically ensures that if the element was set recently, then its time will equal the setAll time. The challenge is to make the reverse implication hold as well.

To be continued ....

The code

I've written this in Haskell because that is the language I know best, but it is not exactly the most natural language for the job.

{-# LANGUAGE BangPatterns #-}
module RepArr where

import Control.Monad.Primitive
import Data.Primitive.MutVar
import qualified Data.Vector.Mutable as V
import Data.Vector.Mutable (MVector)
import Control.Applicative
import Prelude hiding (replicate)
import Control.Monad.ST
import Data.Word

-- The Int in the MutVar is the refresh pointer
data RepArr s a = RepArr (MutVar s (Word, a, Int)) (MVector s (Word,a))

-- Create a fancy array of a given length, initially filled with a given value
replicate :: (PrimMonad m, Applicative m) => Int -> a -> m (RepArr (PrimState m) a)
replicate n a = RepArr <$> newMutVar (0,a,0) <*> V.replicate n (0,a)

getVal :: PrimMonad m => RepArr (PrimState m) a -> Int -> m a
getVal (RepArr allv vec) n = do
  (vectime, vecval) <- V.read vec n
  (alltime, allval, _) <- readMutVar allv
  if (fromIntegral (alltime - vectime) :: Int) > 0
    then return allval
    else return vecval

setVal :: PrimMonad m => RepArr (PrimState m) a -> Int -> a -> m ()
setVal (RepArr allv vec) i v = do
  (!alltime, _, _) <- readMutVar allv
  V.write vec i (alltime, v)

setAll :: PrimMonad m => RepArr (PrimState m) a -> a -> m ()
setAll r@(RepArr allv vec) v = do
  (oldt, _, oldc) <- readMutVar allv
  getVal r oldc >>= setVal r oldc
  let !newc = case oldc+1 of
        op1 | op1 == V.length vec -> 0
            | otherwise -> op1
  let !newt = oldt+1
  writeMutVar allv (newt, v, newc)

To avoid potential (rare) garbage collection pauses, it's actually necessary to unbox the Int and Word values, as well as using unboxed vectors instead of polymorphic ones, but I'm not really in the mood and this is a fairly mechanical task.

Here's a version in C (completely untested):

#include <malloc.h>

struct Pair {
  unsigned long time;
  void* val;
};

struct RepArr {
  unsigned long allT;
  void* allV;
  long count;
  long length;
  struct Pair vec[];
};

struct RepArr *replicate (long n, void* val) {
  struct RepArr *q = malloc (sizeof (struct RepArr)+n*sizeof (struct Pair));
  q->allT = 1;
  q->allV = val;
  q->count = 0;
  q->length = n;

  int i;
  for (i=0; i<n; i++) {
    struct Pair foo = {0,val};
    q->vec[i] = foo;
  }
  return q;
}


void* getVal (struct RepArr *q, long i) {
  if ((long)(q->vec[i].time - q->allT) < 0)
    return q->allV;
  else
    return q->vec[i].val;
}

void setVal (struct RepArr *q, long i, void* val) {
  q->vec[i].time = q->allT;
  q->vec[i].val = val;
}

void setAll (struct RepArr *q, void* val) {
  setVal (q, q->count, getVal (q, q->count));
  q->allV = val;
  q->allT++;
  q->count++;
  if (q->count == q->length)
    q->count = 0;
}

2015-03-06 01:43
by dfeuer

/*

At the time I am writing this, all of the solutions on this page will double (or more) the amount of space required to store the array. The following solution reduces the amount of wasted space from Ω(n) to θ(n/w), where w is the number of bits in a computer "word". On my machine, that's 64.

This prose in this answer is in C comments, so you can copy-and-paste this answer verbatim and compile it with your C compiler.

*/

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*

The problem is to support reading and writing values in an array in O(1) time along with bulk writes in which all values in the array are written at once in O(1) time. This is possible using a technique invented by Aho, Hopcroft and Ullman, as far as I know. I will present a version due to Gonzalo Navarro, "Constant-Time Array Initialization in Little Space".

The idea is to keep three metadata arrays along with the data array. We also keep two integers: unset, which is the last value used in a bulk write operation, and size, an approximation for the number of values that have been set since the last bulk write. At all times, the number of distinct values written since the last bulk write is between size and w * size.

The three metadata arrays describe information about blocks of w values in the data array. They are:

nth: nth[i] is the ith unique block to be written to since the last bulk write
inverse_nth: inverse_nth[i] is the order in in which the ith block of the array was written, counting from 0 at the last bulk write.
bitset: The jth bit of bitset[i] is 1 when the array cell numbered 64*i + j has been written to since the last bulk write.

bitset[i] and inverse_nth[i] are allowed to be invalid if i is not a member of the set {nth[0], nth[1], ... , nth[size-1]}. In other words, inverse_nth[i] and bitset[i] are valid if and only if inverse_nth[i] < size and nth[inverse_nth[i]] == i.

Rather than store three separate arrays of the same length, I chose to store one array, is_set, with three fields.

*/

typedef struct {
  int nth_;
  int inverse_nth_;
  uint64_t bitset_;
} IsSetCell;

typedef struct {
  int unset_;
  int size_;
  IsSetCell is_set_[];
} IsSetArray;

typedef struct {
  IsSetArray * metadata_;
  int payload_[];
} ResettableArray;

/*

To construct an array, we need a default value to return when the reading a value that has never been written to.

*/

ResettableArray * ConstructResettableArray(int n, int unset) {
  ResettableArray* result =
      malloc(offsetof(ResettableArray, payload_) + n * sizeof(int));
  if (!result) return NULL;
  n = (n + 63) / 64;
  result->metadata_ =
      malloc(offsetof(IsSetArray, is_set_) + n * sizeof(IsSetCell));
  if (!result->metadata_) {
    free(result);
    return NULL;
  }
  result->metadata_->size_ = 0;
  result->metadata_->unset_ = unset;
  return result;
}

void DestructResettableArray(ResettableArray * a) {
  if (a) free(a->metadata_);
  free(a);
}

/*

The bulk of the algorithm is in writing and reading the metadata. After IsSet() and Set() are defined (below), reading and writing the arrays is straightforward.

*/

bool IsSet(const IsSetArray * a, int i);
void Set(IsSetArray * a, int i);

int GetValue(const ResettableArray * a, int i) {
  if (!IsSet(a->metadata_, i)) return a->metadata_->unset_;
  return a->payload_[i];
}

void SetValue(ResettableArray * a, int i, int v) {
  a->payload_[i] = v;
  Set(a->metadata_, i);
}

void SetAllValues(ResettableArray * a, int v) {
  a->metadata_->unset_ = v;
}

/*

The complex part of reading and writing is the bidirectional relationship between inverse_nth and nth. If they point to each other at location i (is_set[is_set[i].inverse_nth].nth == i) then location i contains valid data that was written after the last bulk write, as long as is_set[i].inverse_nth < size.

*/

uint64_t OneBit(int i) {
  return UINT64_C(1) << i;
}

bool IsSet(const IsSetArray * a, int i) {
  const int cell = i/64, offset = i & 63;
  const int inverse_nth = a->is_set_[cell].inverse_nth_;
  return inverse_nth < a->size_ && a->is_set_[inverse_nth].nth_ == cell &&
         a->is_set_[cell].bitset_ & OneBit(offset);
}

void Set(IsSetArray * a, int i) {
  const int cell = i/64, offset = i & 63;
  const int inverse_nth = a->is_set_[cell].inverse_nth_;
  if (inverse_nth >= a->size_ || a->is_set_[inverse_nth].nth_ != cell) {
    a->is_set_[cell].inverse_nth_ = a->size_;
    a->is_set_[cell].bitset_ = 0;
    a->is_set_[a->size_].nth_ = cell;
    ++a->size_;
  }
  a->is_set_[cell].bitset_ |= OneBit(offset);
}

2017-01-17 21:25
by jbapple

The code never decreases size_, which I think must be wrong. Perhaps it should be zeroed in SetAllValues() - AnatolyVorobey 2018-07-13 06:08

Python example

class d:
    def __init__(self, l):
        self.len = l
        self.g_p = [i for i in range(self.len)]
        self.g_v = [0 for i in range(self.len)]
        self.g_s = self.len - 1
        self.g_c = 0  

    def getVal(self, i):
        if (i < 0 or i >= self.len):
            return

        if (self.g_p[i] <= self.g_s):
            return self.g_v[self.g_p[i]]

        return self.g_c

    def setVal(self, i, v):
        if (i < 0 or i >= self.len):
            return

        if (self.g_p[i] > self.g_s):
            self.g_s += 1

            self.g_p[self.g_s], self.g_p[i] = self.g_p[i], self.g_p[self.g_s]

        self.g_v[self.g_p[i]] = v

    def setAll(self, v):
        self.g_c = v
        self.g_s = -1

2014-06-11 09:57
by Roee

Regarding Timothy Jone's answer:

A problem with this approach is that eventually you'll run out of ids for timestamp, and might wrap around. If you chose a 64 bit value to store timestamps, then this gives you 18,446,744,073,709,551,616 insertions or setAlls before this happens. Depending on the expected use of the datastructure, an O(n) cleanup phase might be appropriate, or you could just throw an exception.

This is exactly the worst case scenario which make this solution O(n) too, and not O(1). This stucture, although saves a lot of potential O(n) insert operation, is still in O(n) effeciency.

2015-03-05 13:01
by Emliti

See my answer for a way to maintain constant time updates in the face of overflow - dfeuer 2015-03-06 01:51

@dfeuer can you please specify in words how you unbox the Int and Word values? This question is being asked around the world in interviews, and the expected answer by interviewers (as descibed by Timothy) is definitly incorrect, or at least incomplete. In any case, it is obvious that the answer above should be refined. Thanks in advance - Emliti 2015-03-06 13:31

That sounds like a pretty good reason to delete my answer altogether - dfeuer 2015-03-06 13:59

This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient reputation you will be able to comment on any post - Ian 2015-03-06 14:20

@Bluehound Because I ususally don't write here I cannot leave a comment under Timothy's answer. I know I have made an important remark so please don't delete it. If pereferable move this answer under Timothy's answer so anyone can see and comment. Thank yo - Emliti 2015-03-06 19:20

Are you familiar with the concept of amortization - dfeuer 2015-03-22 05:03

@dfeuer I am familiar, but there are 2 problems: 1. The question doesn't specify it (and you can see that nobody point this issue). 2. Even with "amortized" approach, you still have to update the memory capacity of all the array cells, which is linear in the size of the array. To understand the problem, lets assume that the index memory initially consists of 1 byte only. Don't confuse linearity of the array size and the linearity of input operation size, i.e. index of set/setAll operations. In any case Thimoty's answer is either incomplete or incorrect. May be the question itself is incorrect - Emliti 2015-03-23 11:31

@Emliti, I don't really understand you. See my answer for an approach to turning the amortized bounds to real-time ones - dfeuer 2015-03-23 11:39

@dfeuer You have to handle the index memory "clean-up" or widening the memory allocated for storing the index in a constant amortized time. If you choose to widen the index memory, then you have to prove that you get O(1) amortized time. But you have to widen this index for n cells of the array. For simplicity let's start with index of type byte. Then, after 255 you have to wident the type of the byte, say, to short (2 bytes) and so on. Can you specify in words how do you expand it in amortized way - Emliti 2015-03-23 17:06

Addition: It is not like expanding Arraylist length in java (which is O(1) amortized), because you have, again to expand it for n indexes. If you provide a clear example than your answer should be the correct one - Emliti 2015-03-23 17:07

@Emliti, I don't understand what you mean. I operate under the assumption that the index type is wide enough to accommodate the array length as a signed integer, and that the machine involved is a random access machine with a fixed pointer size. You can't really relax these restrictions very much before you end up in a context where you have to consider any sort of array access to be logarithmic - dfeuer 2015-03-23 17:19

@dfeuer, the assumption that the index type is "wide enough" is a problematic approach when analyzing the efficiency of a program. The question required to supply an algorithn with O(1) ,not amortized. That means, that even on the worst case, setAll operation shoud be achieved in constant time. Any other assumption contradicts the definition of this effieciency. Thats the essence of my initial comment. You cannot claim to someone that you have an algorithm of O(1), but there is a worst case scenario in which you have to go through all the n cells of an array.. - Emliti 2015-03-23 23:06

@Emliti, it also requires the ability to set and look up an individual element in O(1) time. In anything short of a RAM model, that is quite simply impossible. Amortization has nothing to do with it. The implementation in my answer never involves anything O(n) except for the initialization at the very beginning, which the OP does not mention - dfeuer 2015-03-23 23:44

To boil this down a bit more, you can't get O(1) arbitrary lookup and O(1) arbitrary set unless you are willing to place an upper bound on the total array size. Not by hook, not by crook, not no way, not no how. Once you've imposed that upper bound, you can select a matching index width, once and for all - dfeuer 2015-03-23 23:57

@dfeuer since the number of operations can be "infinite", any index width can be reached to the maximum value available. Note the length of the array can be 'fixed' to a constant size - the problem of running out of indexes still exists. Of course the index width can be "large enough" for the 'practical' usage, but per the mathematical definition of O(1) complexity, there is a problem, since either I have to allocate new memory every fixed amount of input, or go through all array cells for a cleanup. I think at this stage my point is clear - Emliti 2015-03-24 11:38

@Emliti, oh, so the problem is that you don't understand my answer at all. Read it again, and see how I deal with overflow - dfeuer 2015-03-24 11:42

@dfeuer your answer is indeed unclear to me. Can you specify in words how do you deal with it? For example I don't understand in your C code the setAll function: if (q->count == q->length) q->count = 0; How the order is preserved?Thank you in advance - Emliti 2015-03-24 13:07

@Emliti, you can think of my code as having two coroutines. One of them performs the lookups and updates. The other one "refreshes" the array. Each time someone calls setAll, it performs one step in the refresh process. Note that the comparison between the setAll time and the individual entry time is not done using the usual a<=b, but instead with a-b<=0. These are different in the presence of overflow. I don't test whether one time is before the other, but rather whether one lags the other by at most an amount depending on the width of the type - dfeuer 2015-03-24 13:27

Refreshing ensures that the lag is never too big - dfeuer 2015-03-24 13:28

@dfeuer your q->allT vec[i].time will eventually get overflowed.. - Emliti 2015-03-24 23:42

@Emliti, yes, it will. And that doesn't matter - dfeuer 2015-03-25 01:24

@dfeuer it does matter. to simplify the problem, look at the array of size 5, and make allT and vec[i].time as a byte. After every 256 operations you will get an overflow which you will have to handle. If you come up with the need to widen your operation index memory, then your solution is incorrect. Let m be the number of set/setAll operations.If you choose add memory, you will add m/256 cells which is O(m). If you choose a cleanup, then O(n) where n is the array size. The fact that unsigned long is 'extremely large' for practical reasons doesn't matter from mathematical point of view - Emliti 2015-03-25 10:04

@Emliti, nothing special happens on overflow. If you wish to discuss this more, you can often find me on FreeNode with this same user name - dfeuer 2015-03-25 11:31

I will leave you with one parting reminder: fixed-width integers under addition with overflow form a group (a ring, even, though we don't need that) - dfeuer 2015-03-25 12:23

@dfeuer if you could elaborate a little more on how the overflow doesn't affect the correcnes of (q->vec[i].time - q->allT) < 0) in your getval function, it would be grateful. IMO this is the most important part of the solution, isn't it? Thanks in advance - Emliti 2015-03-25 17:25

@Emilti it's a little tricky but the solution is right. Suppose q->length=5, and long is a byte. Because of the amortization done in setAll(), after every 5 calls to setAll() every q->vec[i].time is updated. As a result, q->vec[i].time in all the cells is never more than q->length==5 units away from q->allT, until the overflow happens. Then q->allT == 0, and (q->vec[i].time - q->allT) will be larger than 128, because q->vec[i].time was within 5 of 256 just before the overflow. And casting that to (long) will be <0. The casting of the difference to signed long is key to understanding - AnatolyVorobey 2018-07-12 18:03

Hi, thanks for your comment. Still no doubt that: 1)The accepted answer should be refined. 2) @dfeuer's answer , although seems correct, should be more detailed (especially the opposite direction proof) The current state of the accepted answer needs definitely to be improved, as interviewers will expect wrong / incomplete answer. I hope you agree with me on this point - Emliti 2018-07-14 11:42

This is my answer in Java (I'm not completly sure about the syntax). I did the set-functions synchronized in order to avoid a situation where changeT and defChangeT are equal.

Struct ValueSt{ 
  int value;
  Timestamp changeT;
}

Class MyDS{
  int default;
  Timestamp defChangeT;
  HashMap map;

  public MyDS(){
    map = new HashMap<int, ValueSt>();
  }

  public synchronized void mySet(Int key, int value){
    ValueSt vs = new ValueSt(value, Timestamp(System.current));
    map.put(key, vs);
  }

  public synchronized void setAll(int value){
    default = value;
    defChangeT = Timestamp(System.current));
  }

  public int myGet(int key){
    ValueSt vs = map.get(key);
    if(vs != null){
      if(vs.changeT > defChangeT)
        return vs.value;
      else
        return default;
    }
    return null;
  }
}

2018-04-10 14:05
by Chedva

-1

The correct solution in C#:

public sealed class SpecialDictionary<T, V>
{
    private Dictionary<T, Tuple<DateTime, V>> innerData;
    private Tuple<DateTime, V> setAllValue;
    private DateTime prevTime;

    public SpecialDictionary()
    {
        innerData = new Dictionary<T, Tuple<DateTime, V>>();
    }
    public void Set(T key, V value) => innerData[key] = new Tuple<DateTime, V>(GetTime(), value);
    public void SetAll(V value) => setAllValue = new Tuple<DateTime, V>(GetTime(), value);
    public V Get(T key)
    {
        Tuple<DateTime, V> tmpValue = innerData[key];
        if (setAllValue?.Item1 > tmpValue.Item1)
        {
            return setAllValue.Item2;
        }
        else
        {
            return tmpValue.Item2;
        }
    }
    private DateTime GetTime()
    {
        if (prevTime == null)
        {
            prevTime = DateTime.Now;

        }
        else
        {
            if (prevTime == DateTime.Now)
            {
                Thread.Sleep(1);
            }
            prevTime = DateTime.Now;
        }
        return prevTime;
    }
}

And the test:

static void Main(string[] args)
{
    SpecialDictionary<string, int> dict = new SpecialDictionary<string, int>();
    dict.Set("testA", 1);
    dict.Set("testB", 2);
    dict.Set("testC", 3);
    Console.WriteLine(dict.Get("testC"));
    dict.SetAll(4);
    dict.Set("testE", 5);
    Console.WriteLine(dict.Get("testC"));
    Console.WriteLine(dict.Get("testE"));
    Console.ReadKey();
}

2017-06-15 09:08
by shlatchz