You work on a subscription-based creator platform and have launched a new onboarding flow intended to increase the share of new users who start a free trial within 7 days. After launch, you discover assignment was intended to be 50/50 by user ID, but treatment ended up over-representing iOS users and higher-intent traffic from paid acquisition because of an implementation bug in one entry surface. The team still wants to know whether the launch had a real causal impact and whether the result is trustworthy enough to ship broadly.
How would you analyze this launch given the imperfect randomization, and how would you redesign or rerun the experiment if needed so that the final ship decision is statistically valid and operationally safe?