Commit 61b6d655 authored by Sang-Hoon Kim

Draft PA1 handout

# Prerequisites
*.d
# Object files
*.o
*.ko
*.obj
*.elf
# Linker output
*.ilk
*.map
*.exp
# Precompiled Headers
*.gch
*.pch
# Libraries
*.lib
*.a
*.la
*.lo
# Shared objects (inc. Windows DLLs)
*.dll
*.so
*.so.*
*.dylib
# Executables
*.exe
*.out
*.app
*.i*86
*.x86_64
*.hex
vm
# Debug files
*.dSYM/
*.su
*.idb
*.pdb
# Kernel Module Compile Results
*.mod*
*.cmd
.tmp_versions/
modules.order
Module.symvers
Mkfile.old
dkms.conf
# Project specific
pa1
TARGET = pa1
CFLAGS = -g

.PHONY: all
all: pa1

pa1: pa1.c
	gcc $(CFLAGS) $^ -o $@

.PHONY: clean
clean:
	rm -rf pa1 *.o pa1.dSYM

.PHONY: test-r
test-r: pa1 testcases/r-format
	./$< testcases/r-format

.PHONY: test-shifts
test-shifts: pa1 testcases/shifts
	./$< testcases/shifts

.PHONY: test-i
test-i: pa1 testcases/i-format
	./$< testcases/i-format

.PHONY: test-all
test-all: test-r test-shifts test-i
## Project #1: MIPS Assembly to Machine Instruction Translator
### ***Due on 12am, October 20 (Wednesday)***
### Goal
Translate MIPS assembly code into corresponding MIPS machine instructions.
### Problem Specification
- Implement a MIPS assembly translator that translates MIPS assembly into MIPS machine code one line at a time. With the command parser of PA0, translate tokens into bits, and merge them to produce one 32-bit machine instruction.
- The framework reads a line of input from the CLI, converts it to lowercase, and calls `parse_command()`. After obtaining the number of tokens and `tokens[]` as in PA0, the framework calls the `translate()` function with them. Write your code in that function to translate the tokenized assembly and return a 32-bit MIPS machine instruction.
- The translator should support the following MIPS assembly instructions.
| Name | Opcode / opcode + funct |
| ------ | ----------------------- |
| `add` | 0 + 0x20 |
| `addi` | 0x08 |
| `sub` | 0 + 0x22 |
| `and` | 0 + 0x24 |
| `andi` | 0x0c |
| `or` | 0 + 0x25 |
| `ori` | 0x0d |
| `nor` | 0 + 0x27 |
| `sll` | 0 + 0x00 |
| `srl` | 0 + 0x02 |
| `sra` | 0 + 0x03 |
| `lw` | 0x23 |
| `sw` | 0x2b |
| `beq` | 0x04 |
| `bne` | 0x05 |
- R-format instructions are given as follows:
```
add s0 t1 gp /* s0 <- t1 + gp */
sub s4 s1 zero /* s4 <- s1 - zero */
sll s0 s2 3 /* s0 <- s2 << 3. The shift amount comes last */
sra s1 t0 -5 /* s1 <- t0 >> -5 w/ sign extension */
```
- I-format instructions are similar:
```
addi s1 s2 0x16 /* s1 <- s2 + 0x16. Immediate values and address offsets
come last */
lw t0 s1 32 /* Load t0 with a word at 32(s1) (s1 + 32) */
```
- The machine has 32 registers, and they are specified in the assembly as follows:
| Name | Number |
| ------ | ------ |
| zero | 0 |
| at | 1 |
| v0, v1 | 2-3 |
| a0-a3 | 4-7 |
| t0-t7 | 8-15 |
| s0-s7 | 16-23 |
| t8-t9 | 24-25 |
| k0-k1 | 26-27 |
| gp | 28 |
| sp | 29 |
| fp | 30 |
| ra | 31 |
- `shamt` and immediate values are given as either (1) decimal numbers without any prefix or (2) hexadecimal numbers with a `0x` prefix. The following are examples:
```
10 /* 10 */
-22 /* -22 */
0x1d /* 29 */
-0x42 /* -66 */
```
- Unspecified register and `shamt` parts should be all 0's. For example, `shamt` part should be 0b00000 for `add` instruction. Likewise, `rs` part should be 0b00000 for `sll` instruction.
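The field merging described above can be sketched with shifts and masks. This is only a minimal sketch, not the reference solution: `pack_r` and `pack_i` are hypothetical helper names, and the bit positions assume the standard MIPS32 layout (opcode in bits 31-26, `rs` 25-21, `rt` 20-16, `rd` 15-11, `shamt` 10-6, `funct` 5-0, and a 16-bit immediate in bits 15-0).

```c
#include <stdint.h>

/* Hypothetical helper: merge the R-format fields into one 32-bit word.
 * The opcode field is 0 for every R-format instruction in the table,
 * so it contributes no bits. Unused fields are simply passed as 0. */
uint32_t pack_r(uint32_t rs, uint32_t rt, uint32_t rd,
                uint32_t shamt, uint32_t funct)
{
	return ((rs    & 0x1f) << 21)
	     | ((rt    & 0x1f) << 16)
	     | ((rd    & 0x1f) << 11)
	     | ((shamt & 0x1f) << 6)
	     |  (funct & 0x3f);
}

/* Hypothetical helper: merge the I-format fields into one 32-bit word.
 * Masking with 0xffff keeps only the low 16 bits, so a negative
 * immediate does not spill into the register and opcode fields. */
uint32_t pack_i(uint32_t opcode, uint32_t rs, uint32_t rt, int32_t imm)
{
	return ((opcode & 0x3f) << 26)
	     | ((rs & 0x1f) << 21)
	     | ((rt & 0x1f) << 16)
	     | ((uint32_t)imm & 0xffff);
}
```

For instance, `pack_r(9, 10, 8, 0, 0x20)` corresponds to `add t0 t1 t2` (`rs` = t1, `rt` = t2, `rd` = t0), and `pack_i(0x08, 29, 29, -0x10)` corresponds to `addi sp sp -0x10`.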
### Example
```
*********************************************************
* >> SCE212 MIPS translator v0.01 << *
* *
* .---. *
* .--------. |___| *
* |.------.| |=. | *
* || >>_ || |-- | *
* |'------'| | | *
* ')______('~~|___| *
* *
* Spring 2022 *
*********************************************************
>> add t0 t1 t2
0x012a4020
>> addi sp sp -0x10
0x23bdfff0
>> sll t0 t1 10
0x00094280
```
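When an output does not match the expected word, it can help to split the word back into its fields and compare them one by one. The following is a minimal debugging sketch under the standard MIPS32 field layout; `struct fields` and `decode` are hypothetical names that are not part of the framework.

```c
#include <stdint.h>

/* Hypothetical container for the fields of one machine instruction. */
struct fields {
	unsigned opcode;
	unsigned rs;
	unsigned rt;
	unsigned rd;
	unsigned shamt;
	unsigned funct;
	unsigned imm;	/* low 16 bits; meaningful for I-format only */
};

/* Extract every field by shifting it down to bit 0 and masking it. */
struct fields decode(uint32_t word)
{
	struct fields f;

	f.opcode = (word >> 26) & 0x3f;
	f.rs     = (word >> 21) & 0x1f;
	f.rt     = (word >> 16) & 0x1f;
	f.rd     = (word >> 11) & 0x1f;
	f.shamt  = (word >> 6)  & 0x1f;
	f.funct  =  word        & 0x3f;
	f.imm    =  word        & 0xffff;
	return f;
}
```

Decoding `0x012a4020` this way gives opcode 0, `rs` 9 (t1), `rt` 10 (t2), `rd` 8 (t0), `shamt` 0, and `funct` 0x20, which is consistent with `add t0 t1 t2`.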
- DO NOT ADD/MODIFY/REMOVE CODE IN THE SPECIFIED ZONES.
- You can submit up to 30 times for this PA.
### Hints/Tips
- You can use any standard C library functions such as `strlen` and `strcpy`. However, Windows-specific functions are banned and will cause a compile error on the server.
- *Bit masking* might be your PA-saver. Search the Internet for the concept and try to leverage it in your implementation. You will find bit shifting and bitwise AND/OR very useful in implementing this PA. Of course, you don't have to use them if you don't want to.
- GCC accepts the `0b` and `0x` prefixes for writing binary and hexadecimal numbers directly (`0x` is standard C; `0b` is a compiler extension before C23).
- `unsigned int x = 0b1010 + 0xdead0011;`
- Helpful functions:
- `strcmp/strncmp`: For matching commands and register names
- `strtol/strtoimax`: Convert a decimal/hexadecimal number (of either sign) in a string to the corresponding `long`/`intmax_t` value
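The two helper-function hints above might be used roughly as follows. This is a sketch under assumptions, not the intended solution: `struct op`, `find_op`, and `parse_number` are hypothetical names, the table values are copied from the opcode table in this handout, and the number parser assumes input is lowercase (as the framework guarantees) and is either plain decimal or `0x`-prefixed hexadecimal.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry: mnemonic with its opcode and funct values. */
struct op {
	const char *name;
	unsigned opcode;
	unsigned funct;	/* used only when opcode is 0 (R-format) */
};

static const struct op ops[] = {
	{ "add",  0x00, 0x20 }, { "addi", 0x08, 0x00 },
	{ "sub",  0x00, 0x22 }, { "and",  0x00, 0x24 },
	{ "andi", 0x0c, 0x00 }, { "or",   0x00, 0x25 },
	{ "ori",  0x0d, 0x00 }, { "nor",  0x00, 0x27 },
	{ "sll",  0x00, 0x00 }, { "srl",  0x00, 0x02 },
	{ "sra",  0x00, 0x03 }, { "lw",   0x23, 0x00 },
	{ "sw",   0x2b, 0x00 }, { "beq",  0x04, 0x00 },
	{ "bne",  0x05, 0x00 },
};

/* Return the table entry whose name matches exactly, or NULL if none. */
const struct op *find_op(const char *mnemonic)
{
	for (size_t i = 0; i < sizeof(ops) / sizeof(ops[0]); i++) {
		if (strcmp(mnemonic, ops[i].name) == 0)
			return &ops[i];
	}
	return NULL;
}

/* Convert "10", "-22", "0x1d", or "-0x42" to a long.
 * The base is chosen explicitly so that a decimal number with a
 * leading zero is not interpreted as octal. */
long parse_number(const char *text)
{
	const char *digits = text;

	if (*digits == '-' || *digits == '+')
		digits++;	/* look past the sign to find the prefix */

	int base = (strncmp(digits, "0x", 2) == 0) ? 16 : 10;

	/* With base 16, strtol accepts an optional sign and "0x" prefix. */
	return strtol(text, NULL, base);
}
```

With these, `parse_number("-0x42")` yields -66 and `find_op("addi")->opcode` yields 0x08, matching the examples and the table above.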
### Submission / Grading
- Use [PAsubmit](https://sslab.ajou.ac.kr/pasubmit) for submission
- 180 pts in total
- Source: pa1.c (150 pts)
- Will be tested with the testcase files under the `testcases` directory.
- Have a look at `Makefile` for testing your implementation with these testcase files.
- Document: *One PDF* document (30 pts)
- Must include all of the following:
- How you translate the instructions
- How tokens and immediate values are translated and combined into machine code. Must include an example case of a negative immediate value
- How you translate the register names to the corresponding register numbers
- ***Lesson learned***: What you've learned while doing this PA. Do not mention common facts that are discussed in the class.
- NO MORE THAN ***THREE*** PAGES
- Please do not read out the C code line by line or paste a screenshot of it. If you cannot explain your implementation without showing the code, something is wrong with either the way the code is written or the way it is explained.
- WILL NOT ANSWER THE QUESTIONS ABOUT THOSE ALREADY SPECIFIED ON THE HANDOUT.
/**********************************************************************
* Copyright (c) 2021-2022
* Sang-Hoon Kim <sanghoonkim@ajou.ac.kr>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
**********************************************************************/
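/* To avoid security error on Visual Studio. The macro only takes effect
 * when it is defined before the standard headers are included. */
#define _CRT_SECURE_NO_WARNINGS
#ifdef _MSC_VER
#pragma warning(disable : 4996)
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <errno.h>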
/*====================================================================*/
/* ****** DO NOT MODIFY ANYTHING FROM THIS LINE ****** */
#define MAX_NR_TOKENS 32 /* Maximum number of tokens in a command */
#define MAX_TOKEN_LEN 64 /* Maximum length of single token */
#define MAX_ASSEMBLY 256 /* Maximum length of assembly string */
typedef unsigned char bool;
#define true 1
#define false 0
/* ****** DO NOT MODIFY ANYTHING UP TO THIS LINE ****** */
/*====================================================================*/
/***********************************************************************
* translate()
*
* DESCRIPTION
* Translate assembly represented in @tokens[] into a MIPS instruction.
* This translator should support the following 15 assembly commands
*
* - add
* - addi
* - sub
* - and
* - andi
* - or
* - ori
* - nor
* - lw
* - sw
* - sll
* - srl
* - sra
* - beq
* - bne
*
* RETURN VALUE
* Return a 32-bit MIPS instruction
*
*/
static unsigned int translate(int nr_tokens, char *tokens[])
{
	/* TODO:
	 * This is an example MIPS instruction. You should change it accordingly.
	 */
	return 0x02324020;
}
/***********************************************************************
* parse_command()
*
* DESCRIPTION
* Parse @assembly, and put each assembly token into @tokens[] and the number
* of tokens into @nr_tokens. You may use this implementation or your own
* from PA0.
*
* An assembly token is defined as a string without any whitespace (i.e., space
* and tab in this programming assignment). For example,
* command = " add t1 t2 s0 "
*
* then, nr_tokens = 4, and tokens is
* tokens[0] = "add"
* tokens[1] = "t1"
* tokens[2] = "t2"
* tokens[3] = "s0"
*
* You can assume that the characters in the input string are all lowercase
* for testing.
*
*
* RETURN VALUE
* Return 0 after filling in @nr_tokens and @tokens[] properly
*
*/
static int parse_command(char *assembly, int *nr_tokens, char *tokens[])
{
	char *curr = assembly;
	int token_started = false;
	*nr_tokens = 0;

	while (*curr != '\0') {
		/* Cast to unsigned char: isspace() is undefined for negative values */
		if (isspace((unsigned char)*curr)) {
			*curr = '\0';
			token_started = false;
		} else {
			if (!token_started) {
				tokens[*nr_tokens] = curr;
				*nr_tokens += 1;
				token_started = true;
			}
		}
		curr++;
	}

	return 0;
}
/*====================================================================*/
/* ****** DO NOT MODIFY ANYTHING BELOW THIS LINE ****** */
/***********************************************************************
* The main function of this program.
*/
int main(int argc, char * const argv[])
{
	char assembly[MAX_ASSEMBLY] = { '\0' };
	FILE *input = stdin;

	if (argc > 1) {
		input = fopen(argv[1], "r");
		if (!input) {
			/* Report the file that failed to open, not the program name */
			fprintf(stderr, "No input file %s\n", argv[1]);
			return EXIT_FAILURE;
		}
	}

	if (input == stdin) {
		printf("*********************************************************\n");
		printf("* >> SCE212 MIPS translator v0.10 << *\n");
		printf("* *\n");
		printf("* .---. *\n");
		printf("* .--------. |___| *\n");
		printf("* |.------.| |=. | *\n");
		printf("* || >>_ || |-- | *\n");
		printf("* |'------'| | | *\n");
		printf("* ')______('~~|___| *\n");
		printf("* *\n");
		printf("* Spring 2022 *\n");
		printf("*********************************************************\n\n");
		printf(">> ");
	}

	while (fgets(assembly, sizeof(assembly), input)) {
		char *tokens[MAX_NR_TOKENS] = { NULL };
		int nr_tokens = 0;
		unsigned int machine_code;

		for (size_t i = 0; i < strlen(assembly); i++) {
			assembly[i] = tolower((unsigned char)assembly[i]);
		}

		if (parse_command(assembly, &nr_tokens, tokens) < 0)
			continue;

		machine_code = translate(nr_tokens, tokens);

		fprintf(stderr, "0x%08x\n", machine_code);

		if (input == stdin) printf(">> ");
	}

	if (input != stdin) fclose(input);

	return EXIT_SUCCESS;
}
addi sp sp 17
addi sp sp 0x25
andi t0 t1 -0x10
ori k1 a2 -0x4bad
lw s0 s1 0x7ee8
sw s4 s1 -0x0072
bne t1 t2 512
beq zero at 0x2eef
add t0 t1 t2
sub t3 t4 t5
and s0 a0 a2
or s2 zero s2
nor t9 sp gp
sll t0 t1 10
srl s0 s1 2
sll s2 s2 0x1d
sra s4 sp 0x03