4  Simulations

Much of what we do in the program and in statistical coding revolves around simulating experiments. This means, repeated a random experiment, potentially many times.

Below are some problems on these themes.

4.1 Questions

  1. In a bag, there are 5 red balls, 3 green balls, and 2 blue balls. Write an R function to randomly draw a ball from the bag.

    Code
    draw_ball <- function() 
    {
      bag <- c(rep("Red", 5), rep("Green", 3), rep("Blue", 2))
      return(sample(bag, 1))
    }


  2. Simulate a random walk of 1000 steps and plot the path. A random walk is such that \(x_0 = 0\), and \(x_t = x_{t-1} \pm 1\) with equal probability

    Code
    simulate_random_walk <- function(steps) 
    {
      walk <- numeric(length = steps)
      walk[1] <- 0
      for(t in 2:steps)
      {
        choices <- sample(c(-1, 1), 1)
        walk[t] <- walk[t-1] + choices
      }
      plot(walk, type = "l", main = "Random Walk", xlab = "Step", ylab = "Position")
    }
    simulate_random_walk(1000)


  3. Write a function that simulates rolling two fair dice n times and returns the proportion of times the sum of the dice is 7.

    Code
    roll_dice_simulation <- function(n) 
    {
      # Simulate rolling two dice n times
      dice1 <- sample(1:6, n, replace = TRUE)
      dice2 <- sample(1:6, n, replace = TRUE)
    
      # Calculate the sum of the two dice for each roll
      sum_of_dice <- dice1 + dice2
    
      # Count the number of times the sum is 7
      count_seven <- sum(sum_of_dice == 7)
    
      # Calculate the proportion of times the sum is 7
      proportion_seven <- count_seven / n
    
      return(proportion_seven)
    }
    
    # Example usage:
    roll_dice_simulation(n = 50)


  4. Write a function that simulates tossing a fair coin 20 times and returns the proportion of heads.

    Code
    coin_toss_proportion <- function() {
      tosses <- sample(c("H", "T"), size = 20, replace = TRUE)
      proportion_heads <- sum(tosses == "H") / 20
      return(proportion_heads)
    }
    
    # Example usage
    set.seed(123)  # For reproducibility
    coin_toss_proportion()


  5. Write function that simulates rolling two fair dice n = 100 times and returns the distribution of the sums of the dice. Plot a histogram of the distribution. Now change values of n, and see the change in histogram

    Code
    prop_diesum = function(n)
    { 
      #initialise the frequency table 
      diesum= rep(0 ,12 )
      for( i in 1:n)
      {
        #take a random sample from die
        die1  = sample(1:6 , size = 1)
        die2  = sample(1:6 , size = 1)
        outcome= die1+die2
    
        #add the outcome to corresponding frequency table
        diesum[outcome] = diesum[outcome] + 1
      }
      return(diesum)
    }
    
    #calculate frequencies for different n 
    ds1e2 = prop_diesum(100)
    ds1e3 = prop_diesum(1e3)
    ds1e4 = prop_diesum(1e4)
    
    # make plots (side-by-side)
    
    par(mfrow = c(1,3))
    barplot(ds1e2, main = "Frequency of Dice Sum (n = 100)", xlab = "Sum of Dice", ylab = "Frequency", col = "skyblue", names.arg = 1:12)
    
    barplot(ds1e3, main = "Frequency of Dice Sum (n = 1000)", xlab = "Sum of Dice", ylab = "Frequency", col = "lightgreen", names.arg = 1:12)
    
    barplot(ds1e4, main = "Frequency of Dice Sum (n = 10000)", xlab = "Sum of Dice", ylab = "Frequency", col = "salmon", names.arg = 1:12)


  6. Write R simulation function rolls() for a fair die is rolled n times that returns 1 if at least 1 of the 6 values never appears and 0 otherwise. Then find the average value by running this 1000 times.

    Code
    rolls = function(n)
    { 
      #initialise the frequency table 
      die = rep( 0 , 6)
      for( i in 1:n)
      {
        #take a random sample from die
        outcome = sample(1:6, size = 1) 
        #add the outcome to corresponding frequency table
        die[outcome] = die[outcome] + 1
      }
      #atleast one value never appeared
      if(min(die) == 0)
      {
        count = 1
      }
      else #all values appeared atleast ones 
      {
        count = 0
      }
      return(count)
    }
    
    
    #function to take average
    expectation = function(num)
    {
      store  = numeric(length = 1e3)
      for( i in 1:1e3)
      {
        store[i] = rolls(num)
      }
      return(mean(store))
    }
    
    #find the expectation 
    y = numeric(length = 50)
    for( i in 1:50)
    {
      y[i] = expectation(num = i)
    }
    
    #make the plot
    plot( x = 1:50  , y = y , main = "Probability vs no of rolls" , type = "o" , col = "blue" , ylab = "Expected Value" , xlab="Rolls")


  7. Zehaan and Aditya are two constantly bickering roommates who always fight for the only available desk in the room. The rule that they have devised is as follows:

    Each morning, they throw two dice, if the sum of the numbers on the resulting faces is a Prime Number then one of them gets the desk else the other. Since the odds are not equally stacked, they flip a coin before this to decide who get’s the prime number odds and who does not.

    Write an R function to determine who wins this daily struggle. Output: Zehaan or Aditya accordingly

    Code
    daily.desk <- function()
    {
      d1 <- sample(1:6, 1)
      d2 <- sample(1:6, 1)
    
      sum.dice <- d1 + d2
    
      coin <- sample(1:2, 1)
    
      if (coin == 1){
        # Zehaan Gets prime odds
        if (sum.dice %in% c(2,3,5,7,11)){
          print("Zehaan")
        }else{
          print("Aditya")
        }
      }else{
        # Aditya Gets prime odds
        if (sum.dice %in% c(2,3,5,7,11)){
          print("Aditya")
        }else{
          print("Zehaan")
        }
      }
    }
    daily.desk()


  8. Yash, being very lazy, decides to study for his exam on the last day. There are \(12\) chapters in total, but due to time constraints, he can only thoroughly study \(8\) of them, chosen randomly. In tomorrow’s exam, \(30\) questions will be asked. Each question has an independent \(0.08\) probability of coming from chapters \(1\) to \(10\) and a \(0.1\) probability of coming from chapters \(11\) and \(12\) each. Yash can only answer questions from the chapters he studied, with each correct answer scoring \(+1\). The passing percentage is \(70\%\). Write an \(\text{R}\) code to determine if Yash will pass the exam. Print “pass” if he does, and “fail” otherwise.

    Code
    total <- 12
    prob_questions <- c(rep(0.08, 10), 0.1, 0.1)
    questions <- sample(1:total, prob = prob_questions, replace = TRUE, size = 30)
    canSolve <- sample(1:total, size = 8)
    score = sum(questions %in% canSolve) / 30
    if(score >= 0.7){
      print("pass")
    } else{
      print("fail")
    }