There are two sorted arrays A and B of size

mandnrespectively. Find the median of the two sorted arrays. The overall run time complexity should beO(log (m+n)).

**Note:**

Please head over to this MIT handout for a much better solution, although their solution does not deal with some special cases, which is easy to fix. Please consult Sophie’s solution which fixes these special cases easily. Although my solution below works, it is too complicated.

**Online Judge**

This problem is available at Online Judge. Head over there and it will judge your solution. Currently only able to compile C++ code. If you are using other languages, you can still verify your solution by looking at the judge’s test cases and its expected output.

**Solution:**

If you search this problem on Google, you will find tons of hits. However, most of them deal with the special case where m == n, and even so their code are filled with bugs. The CLRS book has this problem as exercise in section 9.3-8, however it also assumes the case where *m* == *n*. The only reliable solution I found on the web which deals with the generic case also seemed incorrect, as their definition of the median is the single middle element (although their approach of using binary search is pretty neat). According to the definition of the median, if (*m* + *n*) is even, then the median should be the mean of the two middle numbers.

If you read my previous post: Find the k-th Smallest Element in the Union of Two Sorted Arrays, you know that this problem is somewhat similar. In fact, the problem of finding the median of two sorted arrays when (*m* + *n*) is odd can be thought of solving the special case where k=(*m*+*n*)/2. Although we can still apply the finding k-th smallest algorithm twice to find the two middle numbers when (*m* + *n*) is even, it is no more a desirable solution due to inefficiency.

You might ask: Why not adapt the previous solution to this problem? After all, the previous algorithm solves a more general case. Well, I’ve tried that and I didn’t consider the previous solution is easily adaptable to this problem. The main reason is because when (*m* + *n*) is even, the two middle elements might be located in the same array. This complicates the algorithm and many special cases have to be dealt in a case by case basis.

Similar to finding the k-th smallest, the divide and conquer method is a natural approach to this problem. First, we choose A_{i} and B_{j} (the middle elements of A and B) where i and j are defined as m/2 and n/2. We made an observation that if A_{i} <= B_{j}, then the median must be somewhere between A_{i} and B_{j} (inclusive). Therefore, we could dispose a total of *i* elements from left of Ai and a total of n-j-1 elements to the right of B_{j}. Please take extra caution not to dispose A_{i} or B_{j}, as we might need two middle values to calculate the median (it might also be possible that the two middle values are both in the same array). The case where A_{i} > B_{j} is similar.

Two sorted arrays A and B. i is chosen as m/2 and j is chosen as n/2. A_{i} and B_{j} are middle elements of A and B. If A_{i} < B_{j}, then the median must be between A_{i} and B_{j} (inclusive). Similarly with the opposite.

The main idea illustrated above is mostly right, however there is one more important invariant we have to maintain. It is entirely possible that the number of elements being disposed from each array is different. Look at the example above: If A_{i} <= B_{j}, two elements to the left of A_{i} and three elements to the right of B_{j} are being disposed. Notice that this is no longer a valid sub-problem, as both sub-array’s median is no longer the original median.

Therefore, an important invariant we have to maintain is:

The number of elements being disposed from each array must be the same.

This could be easily achieved by choosing the number of elements to dispose from each array to be (**Warning**: The below condition fails to handle an edge case, for more details see the **EDIT** section below):

k = min(i, n-j-1) when A_{i}<= B_{j}. <---1(a)k = min(m-i-1, j) when A_{i}> B_{j}. <---1(b)

Figuring out how to subdivide the problem is actually the easy part. The hard part is figuring out the base case. (ie, when should we stop subdividing?)

It is obvious that when *m*=1 or *n*=1, you must treat it as a special base case, or else it would end up in an infinite loop. The hard part is reasoning why *m*=2 or *n*=2 requires special case handling as well. (Hint: The two middle elements might be in the same array.)

Finally, implementing the above idea turns out to be an extremely tricky coding exercise. Before looking at the solution below, try to challenge yourself by coding the algorithm.

If you have a more elegant code to this problem, I would love to hear from you!

**EDIT:**

Thanks to Algorist for being the first person who points out a bug. (For more details, read his comment). The bug is caused by some edge cases that are not handled in the base case.

Shortly after I fixed that bug, I discovered another edge case myself which my previous code failed to handle.

An example of one of the edge cases is:

A = { 1, 2, 4, 8, 9, 10 } B = { 3, 5, 6, 7 }

The above conditions ( **1(a)**, **1(b)** ) fails to handle the above edge case, which returns **5** as the median while the correct answer should be **5.5**.

The reason is because the number 5 is discarded in the first iteration, while it should be considered in the final evaluation step of the median. To resolve this edge case, we have to be careful not to discard the neighbor element when its size is even. Here are the corrected conditions ( **2(a)**,** 2(b)**, **2(c)**, **2(d)** ) for *k* which resolves this edge case.

k = min(i-1, n-j-1) when A_{i}<= B_{j}and m is even. <---2(a)k = min(i, n-j-1) when A_{i}<= B_{j}and m is odd. <---2(b)k = min(m-i-1, j-1) when A_{i}> B_{j}and n is even. <---2(c)k = min(m-i-1, j) when A_{i}> B_{j}and n is odd. <---2(d)

Below is the bug-free code after going through a lengthy rigorous testing of all possible edge cases. (Not for the faint of heart!)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 |
double findMedianBaseCase(int med, int C[], int n) { if (n == 1) return (med+C[0])/2.0; if (n % 2 == 0) { int a = C[n/2 - 1], b = C[n/2]; if (med <= a) return a; else if (med <= b) return med; else /* med > b */ return b; } else { int a = C[n/2 - 1], b = C[n/2], c = C[n/2 + 1]; if (med <= a) return (a+b) / 2.0; else if (med <= c) return (med+b) / 2.0; else /* med > c */ return (b+c) / 2.0; } } double findMedianBaseCase2(int med1, int med2, int C[], int n) { if (n % 2 == 0) { int a = (((n/2-2) >= 0) ? C[n/2 - 2] : INT_MIN); int b = C[n/2 - 1], c = C[n/2]; int d = (((n/2 + 1) <= n-1) ? C[n/2 + 1] : INT_MAX); if (med2 <= b) return (b+max(med2,a)) / 2.0; else if (med1 <= b) return (b+min(med2,c)) / 2.0; else if (med1 >= c) return (c+min(med1,d)) / 2.0; else if (med2 >= c) return (c+max(med1,b)) / 2.0; else /* a < med1 <= med2 < b */ return (med1+med2) / 2.0; } else { int a = C[n/2 - 1], b = C[n/2], c = C[n/2 + 1]; if (med1 >= b) return min(med1, c); else if (med2 <= b) return max(med2, a); else /* med1 < b < med2 */ return b; } } double findMedianSingleArray(int A[], int n) { assert(n > 0); return ((n%2 == 1) ? A[n/2] : (A[n/2-1]+A[n/2])/2.0); } double findMedianSortedArrays(int A[], int m, int B[], int n) { assert(m+n >= 1); if (m == 0) return findMedianSingleArray(B, n); else if (n == 0) return findMedianSingleArray(A, m); else if (m == 1) return findMedianBaseCase(A[0], B, n); else if (n == 1) return findMedianBaseCase(B[0], A, m); else if (m == 2) return findMedianBaseCase2(A[0], A[1], B, n); else if (n == 2) return findMedianBaseCase2(B[0], B[1], A, m); int i = m/2, j = n/2, k; if (A[i] <= B[j]) { k = ((m%2 == 0) ? min(i-1, n-j-1) : min(i, n-j-1)); assert(k > 0); return findMedianSortedArrays(A+k, m-k, B, n-k); } else { k = ((n%2 == 0) ? min(m-i-1, j-1) : min(m-i-1, j)); assert(k > 0); return findMedianSortedArrays(A, m-k, B+k, n-k); } } |

**EDIT2:**

A reader buried.shopno had managed to code the solution more elegantly! I especially like how medianOfThree and medianOfFour were implemented. For more details, read his comment below. Great job!

**Further thoughts:**

A reader nimin98 suggested that the base case can be handled by simply doing a direct merge. In other words, we have to merge the short array (containing either one or two elements) with the longer array (pick the four elements near the middle. Deciding which four is another tricky business because of multiple special cases). nimin98’s code has few bugs in the handling of base case.

In general, The above approaches (including mine) to handle the base case are not recommended due to tricky implementation. How about Binary Search? We can use binary search to find the correct position to insert elements from the shorter array into the longer array, thus completing the merge (You don’t have to **actually** insert it, recording its index should be suffice).

something is missing in code above, in the snippet below:

int i = m/2, j = n/2;

if (A[i] 0);

return findMedian(A+k, m-k, B, n-k);

} else {

int k = min(m-i-1, j);

assert(k > 0);

return findMedian(A, m-k, B+k, n-k);

}

the condition if (A[i] 0); is incomplete and also, ‘k’ is not defined in the call return findMedian(A+k, m-k, B, n-k);

0also, else if statement in

// Prerequisite: a must be less than or equal to b

double medianOfThree(int a, int b, int val) {

assert(a <= b);

if (val <= a)

return a;

else if (val b */

return b;

}

is incomplete

0Thanks for pointing that out. It’s fixed now.

+1In

int i = m/2, j = n/2;

if (A[i] 0);

return findMedian(A+k, m-k, B, n-k);

k is undefined and if statement is incomplete

0Very cool analysis! It is actually disappointing to see so many buggy solutions online, while this is a question I often see being asked in an interview.

Good job.

0Two more question to add to your queue :

1) Given two sorted positive integer arrays A[n] and B[n] , we define a set S = {(a,b) | a \in A and b \in B}. The value of such a pair is defined as Val(a,b) = a + b. Now we want to get the n pairs from S with largest values in O(n) time.

2) Consider there is an array with duplicates and you are given two numbers as input and u have to return the minimum distance between the two in the array with minimum complexity.

01.

At any point, starting from right, either Val(A[aIdx-1],B[bIdx]) or Val(A[aIdx], B[bIdx-1]), would foorm the largest val pair; offcourse after A[aIdx], B[bIdx], has been taken up.

0There are some bugs in this the program posted…….

For e.g. It doesn’t handles base case properly.

Arr1 = 2 elements (p, q)

Arr2 = even no. of elements(1,9,11,16,24,28,32,46)

Now p & q may occur anywhere in the list…..

For e.g.

p,q,1,9,11,16,24,28,32,46

1,9,p,q,11,16,24,28,32,46

1,9,p,11,q,16,24,28,32,46

1,9,11,p,q,16,24,28,32,46

1,9,11,p,16,q,24,28,32,46

1,9,11,16,p,q,24,28,32,46

1,9,11,16,p,24,q,28,32,46

1,9,11,16,24,p,q,28,32,46

1,9,11,16,24,p,28,q,32,46

1,9,11,16,24,28,p,q,32,46

1,9,11,16,24,28,32,46,p,q

Please check the logic that you have implemented works for this case. I have placed p & q knowingly around 11,16,24,28 in final array..

0You are absolutely correct. Congrats to Algorist for being the first reader who found the bug! I will give credit to you and correct my code in my post shortly. Thanks!

0Also, one thing that i haven’t understood is

k = min(i, n-j-1) when Ai Bj.

May be very tired that’s why not able to get the logic behind.. Please if anybody could explain this..

0Hey why did you deleted my comments??

0I would never delete my reader’s comments. The reason you are not seeing your comments appearing immediately is because it requires my manual approval. I will remove this restriction now, as the spam filter is working pretty good up to this point. Thanks for yoru patience.

0hey buddy,

I have few questions to send out to you. is that fine to post them directly here or mail them to you. If I want to mail you, can you provide me your mail id?

thanks!

0I have sent you a reply via email that you provided.

0How about this solution; basically, for the boundary case, simply merge the two tiny arrays, any special cases will be handled automatically.

int medianTwoArray(int *arrL, int *arrR, int sizeL, int sizeR) {

if (sizeL <= 2 || sizeR <= 2) {

if (sizeL <= sizeR)

return handleBoundaryCase(arrL, arrR, sizeL, sizeR);

else

return handleBoundaryCase(arrR, arrL, sizeR, sizeL);

}

int medL = sizeL / 2;

int medR = sizeR / 2;

if (medL <= medR)

return medianTwoArray(arrL+medL, arrR, sizeL-medL, sizeR-medL);

else

return medianTwoArray(arrL, arrR+medR, sizeL-medR, sizeR-medR);

}

int handleBoundaryCase(int *arrSmall, int *arrLarge, int sizeSmall, int sizeLarge) {

int medIdx = sizeLarge/2;

// merge two small array into one

int tempArr[7];

int idxSmall = 0;

int idxLarge = 0;

int idxTemp = 0;

int *newLargeArrStart = arrLarge + max(medIdx-2, 0);

memcpy(tempArr+sizeSmall, newLargeArrStart, sizeof(int)*5);

while (idxSmall < sizeSmall && idxLarge < min(5, sizeLarge)) {

if (arrSmall[idxSmall] < newLargeArrStart[idxLarge])

tempArr[idxTemp++] = arrSmall[idxSmall++];

else

tempArr[idxTemp++] = newLargeArrStart[idxLarge++];

}

if (idxSmall < sizeSmall)

memcpy(tempArr+idxTemp, arrSmall+idxSmall, sizeof(int)*(sizeSmall-idxSmall));

return tempArr[(sizeSmall+min(sizeLarge,5))/2];

}

0This method might degrade to linear complexity. The statement “simply merge the two tiny arrays” is flawed, as the other array might not be tiny. (Imagine if you start with m = 1 and n = 1000000). A better approach to handle the base case would be to use binary search to merge the small array into the larger array, which the median is then readily obtained. I already have the bug fix and will update my post as soon as I find the time.

0the maximum size of the tempArr is 7; simply use the media plus minus 2 of the longer array.

0Hi, 1337:

Could you provide code for finding k-th smallest element in an unsorted arrary? Googling can’t help me locate an answer with complete and correct code.

Thanks!

0Hi 1337,

Great works you post on here.

I code this problem here : http://ideone.com/FtqjM

It seems quite simpler than your provided solution. I’ve tested it for different random datasets ( 1 <= N, M <= 1000). Would you mind to have a look, and let me know if you find any possible bug in it. Thanks.

0Try the following test case, which results in infinite recursion:

n = 3

A = {1 2 3}

m = 2

B = {1 2}

I do see your code to have the potential to be more elegant. Looking forward to your bug-free code.

0Hey buried.shopno,

I read your code again and had verified your solution to be correct!

Sorry I missed the part where A’s size has to be less than B’s size (and you swap the array parameters to be passed in if not the case).

Your code is definitely more elegant than mine, so this post will feature your solution (will be updated later) and I will give full credit to you

0Hi 1337,

Thanks for your time to verify my program. I basically follow your approach, but code it in my way.

I wanted to avoid 2 set of base cases (for n == 1, m == 1, n == 2, m == 2), and so swap the function parameters based on array size. I had hard time to handle the base case of n == 2 and m > 2.

0Can you post you code here ? I cannot access the link

+4Your code is inaccessible on IDEone.com

+3Bravo, good job buried.shopno! Your code simplified the base case handling a lot. 1337 did a good job analyzing the recursion pattern.

Based on you two’s work, I made the following two further observations:

1. if you assume m <= n, we can prove through mathematical deduction that the number of elements excluded from A will always be less or equal to the number of elements that could be removed from B. So the min function calls in the recursion step can be skipped. We always pick i or m-i-1 as the number of elements to be removed.

2. Through mathematical deduction, we can also prove that A[i-1] or B[j-1] could be involved in the final calculation of median only when both m and n are even.

I have verified both the above two observations by modify buried.shopno's code.

0your observation 1 is great,

I also agree with 2, when only one is even, the total is odd, the median should be 1 number and the disposed is always lesser or greater then (m+n+1)/2.

Only when both is even the the number is (m+n)/2, which make the disposed be the rightmost of the left (m+n)/2，which is the two numbers needed by median.

I deducted this by calculating 4 situations which either m is even or n is even.

0code for your two observations.

+1Hi, 1337:

It seems that your bug-free code got an wrong answer with the following test case:

A = {1, 3, 4, 5}

B = {1, 3, 4, 6}

expected: 4

got: 3

0Thanks for your comment. (I think the expected answer should be 3.5, not 4)

I’ve just tested your input on the code above, it returns the correct answer 3.5.

Are you sure you are returning the answer as double?

0Yes, the output is 3.5, I’m sorry.

And I think I may have misunderstood the meaning of “median” in the problem. I thought it is the middle element of the result array after merging A and B. After all, it is more reasonable to return one element of the two array than a non-existing number, isn’t it?

0I’m using the definition of median according to Wikipedia: http://en.wikipedia.org/wiki/Median

If N is even, the median should be the mean of the two middle values.

0let me try mine…

0You need to return a double type. The median of {1,2,3,4} is 2.5.

0what i dont get in your solution is why is that you are returning an element only from array m…when it possible for it to exist in any array m or n !!

+2what i dont get in your solution is why is that you are returning an element only from array m…when it possible for it to exist in any array m or n !!

0my comments are meant for the code above ur comment !!

0Hello, first, thanks for your great posts and all of them are inspiring!

About this problem, I think maybe we could still use the MIT solution with a small modification. When n+m is odd, then it’s fine; otherwise, it actually returns the greater one of the two middle numbers and the smaller one will be max(B[j], A[i-1]). My code is here. Looking forward for your comments.

double findMedian(int A[], int B[], int l, int r, int nA, int nB) {

if (l > r) return findMedian(B, A, max(0, (nA+nB)/2-nA), min(nB, (nA+nB)/2), nB, nA);

int i = (l+r)/2;

int j = (nA+nB)/2 – i – 1;

if (j ≥ 0 && A[i] < B[j]) return findMedian(A, B, i+1, r, nA, nB);

else if (j < nB-1 && A[i] > B[j+1]) return findMedian(A, B, l, i-1, nA, nB);

else {

if ( (nA+nB)%2 == 1 ) return A[i];

else if (i > 0) return (A[i]+max(B[j], A[i-1]))/2.0;

else return (A[i]+B[j])/2.0;

}

}

+10@Sophie: would you please give the link to MIT solution that you referred to? It seems a lot simpler than original post’s idea that deal with many special cases. Thanks.

0http://www2.myoops.org/course_material/mit/NR/rdonlyres/Electrical-Engineering-and-Computer-Science/6-046JFall-2005/30C68118-E436-4FE3-8C79-6BAFBB07D935/0/ps9sol.pdf

Sorry. It is in the post, denoted as “the only reliable solution I found on the web”.

0Hi Sophie,

Thanks for your comment. I will look into it later, and will add this problem to online judge so you all can verify your solution soon.

0Hi Sophie,

This problem is added to the Online Judge. Feel free to head over and test your solution.

0Could you please upload the input/output files? I was getting TLE without any clue

Thanks.

0You can try submitting the solution without adding any code, and it will show you Wrong Answer. Then you could see the test cases which allows you to check your program. Let me know if you believe your program was correct but still getting TLE!

0Thanks for a good data set. After a small fix, I got it accepted. My code was getting TLE for a special case n == 0 or m == 0.

0“Accepted!” Thanks for posting it to online judge!

+2Hey Sophie, would you mind to share you complete code here (or, on ideone.com). I ran your above code, which was getting too many wrong answers. I’m confused, even though it seems you follow the same procedure as mentioned in MIT’s note.

Thanks in advance.

0Here’s Sophie’s amazing solution. (Sophie I hope you won’t mind me posting your solution here). I wanna update this post with her solution but haven’t got the time to do it yet. Feel free to add comment on the solution.

double findMedian(int A[], int B[], int l, int r, int nA, int nB) {

if (l>r) return findMedian(B, A, max(0, (nA+nB)/2-nA), min(nB,

(nA+nB)/2), nB, nA);

int i = (l+r)/2;

int j = (nA+nB)/2 – i – 1;

if (j>=0 && A[i] < B[j]) return findMedian(A, B, i+1, r, nA, nB);

else if (j<nB-1 && A[i] > B[j+1]) return findMedian(A, B, l, i-1, nA, nB);

else {

if ( (nA+nB)%2 == 1 ) return A[i];

else if (i>0) return (A[i]+max(B[j], A[i-1]))/2.0;

else return (A[i]+B[j])/2.0;

}

}

double findMedianSortedArrays(int A[], int n, int B[], int m) {

if (n<m) return findMedian(A, B, 0, n-1, n, m);

else return findMedian(B, A, 0, m-1, m, n);

}

+5Sorry for the belated reply. I was out of town for an interview. 1337c0d3r has posted my complete code. The only tricky part is that my function asks for two more parameters, l and r, and thus we need a helper function and set l and r as 0 and n-1 (m-1) as needed. Please let me know if it still doesn’t work for you.

Thanks 1337c0d3r for posting my code!!

+1Hi Sophie/Admin, I implemented Sophie’s idea. I had to put separate checks for the cases where one of the arrays is empty and the other has even length.

Could you please explain why is it working in Sophie’s case?

Example case

[5, 7] and []

Link to my code – http://ideone.com/vc4kPc

0“Accepted!” Thanks for posting it to online judge!

0Congrats Sophie!!!

Your solution is beautiful and concise. I will update this post by featuring your solution, giving full credits to you. Great job!

+11337, I had the same idea with Sophie, but I implemented the idea without recursion. I put the codes into your online test (may need to delete the comments before copy them into the test) and it was accepted!

Following is my code, and can you have a look at and give some comments? Also, can I get some credits too?

//Note: The element right before the target element after combining and sorting these two arrays is not the element right before the target in the current array.

double findMedianSortedArrays(int A1[], int n1, int A2[], int n2){//Assume both A1 and A2 are not empty

//or

if(n1 == 0 && n2 == 0) throw “No element in neither array!”;

int s1 = 0, e1 = n1-1, s2 = 0, e2 = n2-1;//[s1, e1] is the searching range for A1, and [s2, e2] is the searching range for A2. In each iteration of the loop below, the target element should be in either range if it is in.

int targetNum = (n1+n2)/2;//We are going to find the element in either array that is greater than or equal to targetNum((n1+n2)/2) elements in both arraies.

while(s1 <= e1 && s2 = A2[mid2]){

if(mid1+mid2 < targetNum) s2 = mid2 + 1;

else e1 = mid1 – 1;

}else{

if(mid1+mid2 e1){//The searching range for A1 is empty now. Assume the target valude is T. Then T >= A1[s1-1]=A1[e1] and T A2[t-1]) ? A1[e1] : A2[t-1];//

}else{

t = targetNum – s2;

T1 = A1[t];

T2 = (t==0 || A2[e2]> A1[t-1]) ? A2[e2] : A1[t-1];

}

return (n1+n2)%2 ? T1 : ((double) T1+T2)/2;

}

0Sophie, I had the same idea with yours. Can you have a look at the solution I posted and give some comments? Thanks.

0@Sophie

That’s for an awesome implementation of the algorithm and handles very well arrays of different sizes.

Small modifications to make this look simple and will improve performance by removing else statements.

double findMedian(int A[], int B[], int l, int r, int nA, int nB) {

if (l>r) return findMedian(B, A, max(0, (nA+nB)/2-nA), min(nB,(nA+nB)/2), nB, nA);

int i = (l+r)/2;

int j = (nA+nB)/2 – i – 1;

if (j>=0 && A[i] < B[j]) return findMedian(A, B, i+1, r, nA, nB);

if (j B[j+1]) return findMedian(A, B, l, i-1, nA, nB);

if ( (nA+nB)%2 == 1 ) return A[i];

if (i>0) return (A[i]+max(B[j], A[i-1]))/2.0;

return (A[i]+B[j])/2.0;

}

double findMedianSortedArrays(int A[], int n, int B[], int m) {

if (n<m) return findMedian(A, B, 0, n-1, n, m);

return findMedian(B, A, 0, m-1, m, n);

}

0Here is ideone link for it.

http://www.ideone.com/8VgdW

-1this is weird, I download your code, and change to a simple case:

and the answer of your code or Sophie’s method is:

6

which should be 5.

Am I doing anything wrong here?!

0this won’t work for the two arrays {3} and {3}

0@Sophie

Thanks for your elegant codes. It is simple and efficient. While debugging it, I found a minor bug within it. When input is like A={}, B={2, 3}, there might be array bound overflow. That is B[-1] will be evaluated. In visual studio, it seems that no exception will be thrown.

+1j maybe less than 0. so you need discuss this situation.

0j maybe less than 0. so you need discuss this situation

double findM(int A[],int B[],int An,int Bn,int l,int r)

{

if(l > r)

{

return findM(B,A,Bn,An,max(0,(An+Bn)/2-An),min(Bn-1,(An+Bn)/2));

}

int i = (l+r)/2;

int j = (An+Bn)/2-i-1;

if(j>= 0 && A[i] < B[j])

{

return findM(A,B,An,Bn,i+1,r);

}

else if(j B[j+1])

{

return findM(A,B,An,Bn,l,i-1);

}

else

{

if((An+Bn) % 2 == 0)

{

if(j0)

return (A[i] + max(A[i-1],B[j]))/2.0;

else

return (A[i] + B[j])/2.0;

}

else

{

return A[i];

}

}

}

0seems can’t pass current online judge now. Changed a little bit, and passed judge.

0seems code tag has issue.

public class Solution {

public double findMedianSortedArrays(int A[], int B[]) {

// Start typing your Java solution below

// DO NOT write main() function

int alength = A.length;

int blength = B.length;

if (alength == 0 ) {

if (blength % 2 == 1) {

return B[blength/2];

}

else {

return ( B[blength/2 – 1] + B[blength/2] ) / 2.0;

}

}

else if (blength == 0 ) {

if (alength % 2 == 1) {

return A[alength/2];

}

else {

return ( A[alength/2 – 1] + A[alength/2] ) / 2.0;

}

}

int left = 0;

int right = alength-1;

return findMedian(A, B, left, right, alength, blength);

}

public double findMedian(int A[], int B[], int l, int r, int nA, int nB) {

if (l>r) {

int left = 0;

int right = nB-1;

return findMedian(B, A, left, right, nB, nA);

}

int i = (l+r)/2;

int j = (nA+nB)/2-i-1;

boolean lvalid = false;

if (j <= nB – 1 ) {

lvalid = (j = B[j]);

}

else {

lvalid = (j =0 ) {

rvalid = (j >= nB – 1) || (A[i] = nB – 1);

}

if ( lvalid && !rvalid ) return findMedian(A, B, l, i-1, nA, nB);

else if (!lvalid && rvalid) return findMedian(A, B, i+1, r, nA, nB);

else {

if ( (nA+nB)%2 == 1 ) return A[i];

else if (i>0) {

if (j<0) {

return (A[i]+A[i-1])/2.0;

}

else {

return (A[i]+Math.max(B[j], A[i-1]))/2.0;

}

}

else return (A[i]+B[j])/2.0;

}

}

}

-1still can’t post right coe. above code is not right, pls ignore.

0There is a small bug in the code, when

int A[] = {1, 2}

int B[] = {}

this case will make the j equal to -1, Sophie’s solution doesn’t check this.

0There is no need to apply the finding k-th smallest algorithm twice to find the two middle numbers when (m + n) is even.

Think this way: use find kth smallest alg to find the (m+n)/2 value,let’s say it is A[i]

B[j-1]<A[i]<B[j]

i+j-1+2=(m+n)/2

The next middle number (m+n)/2+1 is the value min(A[i+1], B[j])

0double findKthElement(int A[], int m, int B[], int n, int k, int kplus) {

if (0 == m)

return (B[k-1]+B[kplus-1])/(double) 2;

if (0 == n)

return (A[k-1]+A[kplus-1])/(double) 2;

if (k>2)

{

int i= ((k-3)>>1) >1) : m-1;

int j=k-3-i;

if (j>=n)

{

j=n-1;

i=k-3-j;

}

if (A[i]>=B[j])

{

int bj_1 = j+1<n ? B[j+1] : INT_MAX;

if (A[i]<=bj_1)

return findKthElement(A+i+1,m-i-1,B+j+1,n-j-1,1,kplus-k+1);

else

return findKthElement(A,m,B+j+1,n-j-1,k-j-1,kplus-j-1);

}

else

{

int ai_1 = i+1<m ? A[i+1] : INT_MAX;

if (B[j]<=ai_1)

return findKthElement(A+i+1,m-i-1,B+j+1,n-j-1,1,kplus-k+1);

else

return findKthElement(A+i+1,m-i-1,B,n,k-i-1,kplus-i-1);

}

}

else

{

int kv,kplusv,up=0,down=0;

int target = kplus;

while (target–)

{

int i=(up<m)?A[up]:INT_MAX;

int j=(down<n)?B[down]:INT_MAX;

kv=kplusv;

if (i>1)+1,((m+n)>>1)+1);

else

return findKthElement(A,m,B,n,((m+n)>>1),((m+n)>>1)+1);

}

0Hi Sophie,

Can you explain how did you arrive at the below wquation

if (l > r) return findMedian(B, A, max(0, (nA+nB)/2-nA), min(nB, (nA+nB)/2), nB, nA);

0Check out the pdf I attached in the posts. They have detailed explanation. It has been a while. Or, try some real numbers, then you can have some sense.

0I’ve seen another popular question related: find the median of infinite stream of number.

I guess it’s an open question, anyone can please share some good ideas? Thanks!

0“The only reliable solution I found on the web which deals with the generic case also

seemed incorrect“…..May I know what is incorrect there?

0Cant this be solved easily using the same algorithm that was used in http://www.leetcode.com/2010/10/searching-2d-sorted-matrix-part-ii.html?

For median, we could find Kth element (for odd) or Find K, (K+1) nth element for even and calculate median. Right?

0I use Binary search algorithm to find the median. Search the median in one array first. If it fails , try the second array.

Suppose there are two arrays a[m] and b[n].

If a[i] is the median, then it should satisfy some inequality like b[j]<=a[i]<=b[j+1] and i+j + (1or 2 ) = (m+n)/2.

There are lots of edge cases to consider. I also tested for many cases.

double findMedianSortedArrays(int A[], int m, int B[], int n) {

if(!m && !n) return 0;

if(!m){

if(n%2) return B[n/2];

else return 0.5*(B[n/2-1]+B[n/2]);

}

if(!n){

if(m%2) return A[m/2];

else return 0.5*(A[m/2-1]+A[m/2]);

}

if(m==1 && n==1) return 0.5*(A[0]+B[0]);

if(m==1) {

if(n%2){

if(A[0]<=B[n/2-1]) return 0.5*(B[n/2]+B[n/2-1]);

else if(A[0] <=B[n/2+1]) return 0.5*(A[0]+B[n/2]);

else return 0.5*(B[n/2+1]+B[n/2]);

}

else {

if(A[0]<=B[n/2-1]) return B[n/2-1];

else if (A[0]<=B[n/2]) return A[0];

else return B[n/2];

}

}

if(n==1) return findMedianSortedArrays(B,1, A, m);

if(n==2 && m==2) return 0.5*(min(A[1],B[1])+ max(A[0],B[0]));

double result=search(A, m, B, n);

if(result1){

i=(s+e)/2;

if(i>m-1) break;

j=(m+n)/2-i-(odd?1:2);

while(jA[i]){

if(odd) return A[(n+m)/2];

else return 0.5*(A[(n+m)/2-1]+A[(n+m)/2]);

}

e=i;

i=(s+e)/2;

j=(m+n)/2-i-(odd?1:2);

}

if((j<=n-1 && (A[i]<B[(jn-1){

s=i;

}

else if(j+1>=n || (j+1<n && A[i]<=B[(j+1=n) return 0.5*(A[i]+A[i+1]);

else return 0.5*(A[i]+ (i+1<m?min(A[i+1],B[j+1]):B[j+1]));

}

}

else e=i;

}

if(e-s==1){

for(i=s; in-1 || j=B[j] && (j+1<=n-1? A[i]=n) return 0.5*(A[i]+A[i+1]);

else return 0.5*(A[i]+ (i+1<m?min(A[i+1],B[j+1]):B[j+1]));

}

}

}

}

return -100000.0;

}

0Hi, first, Sophie, thank you for your algorithm, and thank you 1337c0d3r for creating this site, it is very helpful for interview prep. I do have one question about Sophie’s algorithm. I looked at the MIT site, and the algorithm makes good sense to me, except for this part:

(B, A, max(0, (nA+nB)/2-nA), min(nB, (nA+nB)/2), nB, nA); Why do we need these exact initial boundaries? thank you.

0I think I got the answer:

a.length = l

b.length = m

l + m = n

unionMedian(A, B, max(0, n/2 – m), min(l, n/2)) //why? See below

So since we will be searching in A for i, we put an upper bound on i by either the length of A (we don’t want an out

of bounds error) or n/2 (because we know the median of the merged array cannot possibly be past A[n/2]).

As for the lower bound, we can reason as follows: we want either 0, or if B is smaller than n/2, we will need some elements in

A to make up the distance to n/2; the exact amount of elements we need from A is the difference between n/2 and m, and since those elements

cannot be the median of the merged array, we can discard them initially.

+1I found an interesting article like this only .But was a bit easy to understand..

http://www.malloc.co/algorithms/finding-the-median-of-two-sorted-arrays/

+1Hello guys looking at this interesting problem.

Here is the my java code that passes all tests in online judge.

It does not have specific code for corner cases, but I suppose those cases are handled automatically.

I firstly search the median in A using a binary search, then if I can’t find it in A, I look for the median in B.

0One comment in the code is not right:

//if m+n is even, then the median is the average of (m+n)/2 and (m+n)/2 + 1

that should have been:

//if m+n is even, then the median is the average of (m+n)/2 and (m+n)/2 – 1

0Is this time complexity O(log(m+n))? I think your solution should O(logm + logn) = O(log(m*n)).

0Hui Shu, I like your solution! Straightforward and do not need to deal with corner cases which make code messy. This method should be able to used in finding Kth smallest number issue. The code given now is faulty.

0recursive version: Hui Shu’s get method is very useful

public double findMedianSortedArrays(int[] A, int[] B) {

// Start typing your Java solution below

// DO NOT write main() function

return medianSearch(A, B, 0, A.length-1);

}

public double medianSearch(int[] A, int[] B, int left, int right ){

if(left > right){

return medianSearch(B, A, 0, B.length-1);

}

int i = (left+right)/2;

int j = (A.length+B.length)/2-i;

if(get(A, i) get(B, j)){

return medianSearch(A,B, left, i-1);

}

if((A.length + B.length)%2 == 0) {

return (get(A,i)+Math.max(get(A,i-1), get(B, j-1)))/2.0;

}else{

return get(A,i);

}

}

public int get(int[] a, int i){

if(i = a.length){

return Integer.MAX_VALUE;

} else{

return a[i];

}

}

0Repost the solution as first one doesn’t work well.

public double findMedianSortedArrays(int[] A, int[] B) {

// Start typing your Java solution below

// DO NOT write main() function

return medianSearch(A, B, 0, A.length-1);

}

public double medianSearch(int[] A, int[] B, int left, int right ){

if(left > right){

return medianSearch(B, A, 0, B.length-1);

}

int i = (left+right)/2;

int j = (A.length+B.length)/2-i;

if(get(A, i) get(B, j)){

return medianSearch(A,B, left, i-1);

}

if((A.length + B.length)%2 == 0) {

return (get(A,i)+Math.max(get(A,i-1), get(B, j-1)))/2.0;

}else{

return get(A,i);

}

}

public int get(int[] a, int i){

if(i = a.length){

return Integer.MAX_VALUE;

} else{

return a[i];

}

}

0Isnt complexity just log n rather than log(m+n). I have not checked your code but rather MIT handout

0Could you explain why the average case complexity should be O(log(m+n))?

The below is my analysis:

case 1: when n >> m:

basically it finish disposing the shorter array then the larger one. so:

2*log(m) + log(n-m) ~ log(n)

case 2: when n and m are similar (let’s say m < n):

2*log(m) + log(n-m) ~ 2*log(m)

0Pingback: 二分查找法的实现和应用(进阶篇) « 成人免费资源三级分享网站

It seems that case 2(a) — 2(d) can be combined to one case:

int mid1 = (m – 1) / 2, mid2 = (n – 1) / 2;

k = min(mid1, mid2).

The following code passes on the cases:

class Solution {

public:

double minTwoAverage(int a[])

{

for(int i = 1; i <= 2; i++)

for(int j = 0; j < 4 – i; j++)

if(a[j] < a[j + 1])

swap(a[j], a[j + 1]);

return (double(a[2] + a[3])) / 2;

}

double baseCase(int A[], int m)

{

if((m & 1) == 0)

return (double(A[(m – 1) / 2] + A[(m – 1) / 2 + 1])) / 2;

return (double(A[(m – 1) / 2]));

}

double baseCaseOne(int A[], int m, int val)

{

int mid = (m – 1) / 2;

if((m & 1) == 0)

{

if(A[mid] <= val)

return min(A[mid + 1], val);

return A[mid];

}

if(m == 1)

return (double(A[0] + val)) / 2;

if(A[mid] <= val)

return (double(A[mid] + min(A[mid + 1], val))) / 2;

return (double(A[mid] + max(A[mid – 1], val))) / 2;

}

double baseCaseTwo(int A[], int m, int B[], int n)

{

int mid1 = (m – 1) / 2, mid2 = (n – 1) / 2;

if(m == 2)

return (double(max(A[mid1], B[mid2]) + min(A[mid1 + 1], B[mid2 + 1]))) / 2;

if((m & 1) == 0)

{

if(A[mid1] <= B[mid2])

{

int a[4] = {A[mid1 + 1], A[mid1 + 2], B[mid2], B[mid2 + 1]};

return minTwoAverage(a);

}

if(A[mid1] <= B[mid2 + 1])

return (double(A[mid1] + min(A[mid1 + 1], B[mid2 + 1]))) / 2;

return (double(A[mid1] + max(A[mid1 – 1], B[mid2 + 1]))) / 2;

}

if(A[mid1] B[mid2 + 1])

return max(A[mid1 – 1], B[mid2 + 1]);

return A[mid1];

}

double findMedianSortedArrays(int A[], int m, int B[], int n) {

if(!m)

return baseCase(B, n);

if(!n)

return baseCase(A, m);

if(m == 1)

return baseCaseOne(B, n, A[0]);

if(n == 1)

return baseCaseOne(A, m, B[0]);

if(m == 2)

return baseCaseTwo(B, n, A, m);

if(n == 2)

return baseCaseTwo(A, m, B, n);

int mid1 = (m – 1) / 2, mid2 = (n – 1) / 2;

int k = min(mid1, mid2);

if(A[mid1] >= B[mid2])

return findMedianSortedArrays(A, m – k, B + k, n – k);

return findMedianSortedArrays(A + k, m – k, B, n – k);

}

};

0Pingback: Ider – 二分查找法的实现和应用(进阶篇)

I think this could be done using ur previous post

http://leetcode.com/2011/01/find-k-th-smallest-element-in-union-of.html

And you don need to call twice or handling many cases

in case of m+n is even

we need to return average of k and k+1 element (where k will be (m+n)/2, definition of median)

if (Bj_1 < Ai && Ai Bj ? Bj : Ai.next; (probably one more condition to check Ai.next exist or not )

return (kth + kthnext)/2

}

let me know if something more needed.

Please note that we are not calling the function twice so it will not be inefficient .

0ignore the previous it was typo

if (Bj_1 < Ai && Ai Bj ? Bj : Ai+1 (probably one more condition to check Ai+1 exist or not )

return (kth + kthnext)/2

0Can you explain how the complexity of sophie’s solutions in log(n+m). I think its

logm + logn.

0actually you can easily adapt your kth algorithm here. if m + n is even, you can easily get the next value after comparing the value of i+1 and j or j+1 and i.

0Hey one quick question, why are you adding k to your arrays A and B? How did it compile?

return findMedianSortedArrays(

A+k, m-k, B, n-k);return findMedianSortedArrays(A, m-k,

B+k, n-k);0Same question here….feel confused

0The following code passes the online test:

+3Pingback: read.guoruEi » Blog Archive » 二分查找法的实现和应用(进阶篇)

Seems Sophie’s method failed this test cases:

int[] A = {1, 2, 3, 4, 5};

int[] B = {6, 7, 8, 9, 10, 11, 12, 13, 14};

0In order to use the k-th element approach, you just need to apply once for k=(m+n)/2. when m+n is even, the k+1 -th can be found in constant time.

Say k-th element is j in array A. the k+1 -th element can only be j+1 in array A or k-j-1 -th element in array B, whichever is smaller. We don’t need to apply the same k-th element approach again. And this solves the base cases and corner cases problem.

0Hey guys, is there any AC code of binary search version. I’ve tried

solve this problem in binary search way, but it seems a lot of boundary case.

Lots of failed cases is caused by can not find valid A[i],B[j] and B[j+1] let B[j]<=A[i]<=B[j+1].

I am very confused now, can some one help me, thanks a lot.

Besides, the solution of divide and conquer version works just well.

0The code below is based on the same idea:

compare the medians of two sorted array each step. After the comparison we can get ride of half length of the short array (such as B) in both A and B, so we can recursively call the same method. In log(B.length) steps, it will reach the edge case.

The code in Java below isn’t that clean, but passed both small and large tests and should show the idea.

0The code below is based on the same idea:

compare the medians of two sorted array each step. After the comparison we can get ride of half length of the short array (such as B) in both A and B, so we can recursively call the same method. In log(B.length) steps, it will reach the edge case.

The code in Java below isn’t that clean, but passed both small and large tests and should show the idea.

00The problem is:

while the number of elements being disposed from each array must be the same, can you still ensure the time complexity being log(m+n)?

+1while(lo=n-1) {ans=1;lo=0;hi=n-1;mid=(lo+hi)/2;}

j=n-mid-1;

if(!ans)

{

if(ar1[mid]>=ar2[j] && ar1[mid]<ar2[j+1])

{

cout<ar2[j] && ar1[mid]>ar2[j+1])

hi=mid-1;

else if(ar1[mid]<ar2[j] && ar1[mid]=ar2[j] && ar2[mid]<ar2[j+1])

{

cout<ar2[j] && ar2[mid]>ar2[j+1])

hi=mid-1;

else if(ar2[mid]<ar2[j] && ar2[mid]<ar2[j+1])

lo=mid+1;

}

}

0This is just a special case of case 2 in the “Find k-th smallest element”. k = [m+n]/2 (n<=k<=m).

0The time complexity should be O(log(min(m,n))) instead of O(log(m+n)).

0The following code has passed the online judge. The idea is that for the case when m+n is even,

simply remove the largest element to m+n is odd. The advantage of m+n is odd is the the median will belongs to one of the sorted array. As a result, there is fewer corner case.

However, there is still one corner case when A has only one element. But it is pretty simple.

0The code above is not complete

0Sorry I have some issues with the html editor in leetcode.

here is plain text. Anyone know how to delete a reply?

class Solution {

public:

double findMedianSortedArrays(int A[], int m, int B[], int n) {

// Start typing your C/C++ solution below

// DO NOT write int main() function

if (m == 0) {

if (n%2 == 0) {

return (B[n/2] + B[(n/2)-1])/2.0;

} else {

return B[n/2];

}

} else if (n == 0) {

return findMedianSortedArrays(B, n, A, m);

}

if ( (m + n) % 2 == 0) { // for when there are even number of elements.

// Remove the largest element from the array, calculate the median

double value;

if (A[m-1] > B[n-1]) {

value = findMedianSortedArrays(A, 0, m-1, B, 0, n);

} else {

value = findMedianSortedArrays(A, 0, m, B, 0, n-1);

}

// count total number of elements that ((m + n) / 2)) {

return value;

} else {

double minv = INT_MAX;

if ((auit – A < m) && (*auit <= minv)) minv = *auit;

if ((buit – B < n) && (*buit <= minv)) minv = *buit;

return value / 2.0 + minv / 2.0;

}

}

// for odd number of elements, median is one element in the sorted array.

return findMedianSortedArrays(A, 0, m, B, 0, n);

}

double get(int B[], int mid, int rIdx, int rEle) {

if (mid B[Bmid]) || (Be – Bs <= 1)) {

return findMedianSortedArrays(B, Bs, Be, A, As, Ae);

}

int shift = min(Amid – As, Be – Bmid);

return findMedianSortedArrays(A, As + shift, Ae, B, Bs, Be – shift);

}

};

0You can’t make even numbers to odd numbers. When the number of numbers are even, the median is supposed to be the average of middle two numbers.

0why the median of {1,2,3,4} is 2.5, instead of 2/3 ? The median is not average– it is a number

0I am thinking using finding kth order statistic of two sorted seems easier to understand, below is my code.

0I do not know why some part of the codes is missing…

below is the corrected one

0I actually had this question on a Google interview, and flubbed it. Afterwards, of course, I came home and solved it. If memory serves me correctly, The solution I had was O(log(min(m,n))), not O(log(m + n)). The argument for my complexity is simple: suppose n = 1. Then the median is adjusted up or down depending on whether the single element of the n set is greater or lessor than the median of the m set, independent of m.

0Charming, self-explaining, tidy and implementation-easy Java code for the median of two sorted array problem. It’s based on Hui Su’s and Benben’s idea/code.

The basic idea is the same with the MIT handout and to compare candidate in A[] with its corresponding in B[]. I was amazed that the introduction of the valueAt(int[] arr, int i) function handles everything for us.

-2Working code:

0Here is a simpler solution without recursion. O(log(m+n)).

0Lines near a_i and b_j are messed up. Don’t know how to get them displayed correctly:(

0check http://discuss.leetcode.com/questions/142/median-of-two-sorted-arrays?page=1&focusedAnswerId=2695#2695 for correct code:)

0@1337c0d3r

I am new to this old thread, but would like to share my opinion regarding your following comment, for your critique:

“If you read my previous post: Find the k-th Smallest Element in the Union of Two Sorted Arrays, you know that this problem is somewhat similar. In fact, the problem of finding the median of two sorted arrays when (m + n) is odd can be thought of solving the special case where k=(m+n)/2. Although we can still apply the finding k-th smallest algorithm twice to find the two middle numbers when (m + n) is even, it is no more a desirable solution due to inefficiency.”

I don’t think we have to call the k-th smallest algorithm twice. Once the k-th smallest is found, as long as we record the array (A or B) and index in that array which the k-th smallest refers to, which is trivial, the (k+1)-th element of A and B can be found in constant time. As such, one only needs to add a simple wrapper method to the k-th smallest algorithm class to get the median to handle the odd/even cases.

Personally I still favor the k-th smallest algorithm more, since it provides a more general solution, and also less cryptic.

+2Pingback: Find Median of Two Sorted Arrays | Little House

0I highly doubt if the codes could get any simpler.

0my solution is to binary search the solution space and try to binary approximate the answer, and the algorithm complexity is O(log(MAX_INT)*(log(n)+log(m)))

0Pingback: (Leetcode) Median of Two Sorted Arrays | Changhaz's Codeplay

Begin with ar1 and ar2, and let m1 and m2 be the 2 medians respectively and let M be actual median.

M will lie between m1 and m2. If m1 & gt;

m2 => M lies between m2 to m1, so all elements less than m1 but

more than m2, hence first half of ar1 and 2nd half of ar2. Likewise for

m2>m1. and go on until number of elements are 2. Here is the post – http://k2code.blogspot.in/2011/09/median-of-2-sorted-arrays.html

0Pingback: LeetCode: Median of 2 sorted arrays | Jerry's Blog

Hi – This is a great solution. I was viewing the code, but have problem understanding how you came up with the logic inside:

findMedianBaseCase()

findMedianBaseCase2()

How do you figure out the formula to figure out the median in these boundary cases?

0Why we need to dispose equal number of elements both arrays?

My understanding is if we keep disposing unequal number from both arrays which essentially means that the number of elements which fall into left side of median and right side of median are unequal. If we keep doing this we will end up with a number to which the number of elements on the left side is not same as that on the right side and hence the number is not the median. Hope my understanding is correct

0We probably can use “elimination” concept here to resolve the problem. The goal is to “eliminate” about (m+n)/2 smaller elements.

If the element m/2 is less than the element n/2, the first half of the first array (m/2 elements) can be “eliminated”. The target is in the second half of the first array and first half of the second array (this concept is from the original post). The new sub-problem becomes “eliminating (m+n)/2 – m/2 elements from the two new half arrays”. That creates a recursion pattern. It is a modified binary search.

Same elimination concept probably can apply to the similar problem: “Find the k-th Smallest Element in the Union of Two Sorted Arrays” that should give a 2*lg(k) complexity.

0Pingback: Mediam of two sorted arrays | akajj

Pingback: LeetCode Median of Two Sorted Arrays

Pingback: LeetCode OJ (C#) – Median of Two Sorted Arrays | miafish

Hey 1337 coder,

Must say you are really a BigB for algo. Well, I am a noob in coding.

I have a little bit doubts (compiler error) in 2 of my codes . so can you provide me your email id so that i can make it clear with you?

Thanks

01337 coder…

plzzz reply as i really need your help in my code.

0