Then Notes 隨筆

Good Taste in Code – Linus Torvalds 對於程式碼的品味

Introduction

在 2016 年的 TED 演講 The mind behind Linux 中,Linus Torvalds 談到了程式碼的「品味」,並在投影片上以單向連結串列(singly linked list)的兩種實作方式來作說明。

以下是兩種實作的虛擬碼(pseudocode),第一種實作方式非常的直觀,我想也是我們大多數人的寫法。需要用到一個 if 條件來判斷要移除的節點是否是此串列的第一項,因為要移除的節點位置在首項還是在中間,會有不同處理方式。

Bad Taste

// Bad taste
remove_list_entry(entry) {
    prev = NULL;
    walk = head;

    // Walk the list
    while (walk != entry) {
        prev = walk;
        walk = walk->next;
    }

    // Remove the entry by updating the
    // head or the previous entry
    if (!prev)
        head = entry->next;
    else
        prev->next = entry->next;
}

要注意的是,這段虛擬碼著重在移除節點的思維方式,不討論記憶體釋放、串列為空或節點不存在等問題。

如果要移除的節點是第一個,那我們需要把 head 指到下一個節點,也就是 head = entry->next;反之若在中間,則需要將所移除節點的上一個節點指到下一個去,意即 prev->next = entry->next

Good Taste

而第二種實作方式,則完全不需要 if 條件判斷,不會有例外狀況的產生,我們來看程式碼:

// Good taste
remove_list_entry(entry) {
    // The "indirect" pointer points to the
    // *address* of the thing we'll update
    indirect = &head;

    // Walk the list, looking for the thing that
    // points to the entry we want to remove
    while ((*indirect) != entry)
        indirect = &(*indirect)->next;

    // ..and just remove it
    *indirect = entry->next;
}

畢竟是虛擬碼,又涉及指標操作,所以可能比較難讀懂。所以我改寫成以下能動的版本,完整程式則附於文末。

void remove_list_entry(List **head, List *entry) {
    // The "indirect" pointer points to the
    // *address* of the thing we'll update
    List **indirect = head;

    // Walk the list, looking for the thing that
    // points to the entry we want to remove
    while ((*indirect) != entry)
        indirect = &(*indirect)->next;

    // ..and just remove it
    *indirect = entry->next;
}

這裡用到的是指標的指標(pointer to pointer),再經過 List **indirect = head 後,此時會讓指標 indirect 指向指標 head,就是指標的指標。

注意,傳入函式時 remove_list_entry(&head, entry) 已經是傳入指標 head 的位址,所以這裡跟原先的虛擬碼不同,並非 List **indirect = &head

// Walk the list, looking for the thing that
// points to the entry we want to remove
while ((*indirect) != entry)
    indirect = &(*indirect)->next;
  1. (*indirect):用取值(dereference)運算子 * 拿到指標所指向的節點
  2. (*indirect)->next:得到 next 欄位所儲存的下一個節點
  3. &(*indirect)->next:用取址(reference)運算子 & 取得 next 欄位的位址
  4. indirect = &(*indirect)->next:將 next 欄位的位址存回 indirect
// ..and just remove it
*indirect = entry->next;

如果 *indirect 是要刪除的節點 entry,則指標 indirect 指向前一個節點中的 next 指標(指標的指標),所以 *indirect 設為 entry 的下一個節點,即 *indirect = entry->next

data-model-indirect

看一下記憶體的位址可能會比較好理解:

node 1: 0xa0
node 2: 0xc0
node 3: 0xe0

-------- p ---------
*p:             0xa0
(*p)->next:     0xc0

------ p -> p' -----
&(*p)->next:    0xa8

-------- p' --------
*(&(*p)->next): 0xc0
(*(&(*p)->next))->next: 0xe0

Conclusion

I don’t want you to understand why it doesn’t have the if statement. But I want you to understand that sometimes you can see a problem in a different way and rewrite it so that a special case goes away and becomes the normal case, and that’s good code. — Linus Torvalds

Linus 透過這個例子是要跟讀者說明,有的時候只要我們換個角度去思考問題,或許可以將其中的特殊情況簡化為普通問題,而這就是好的程式。

Code Example

注意,此程式一樣沒有考慮記憶體釋放、串列為空或節點不存在等問題。

#include <stdio.h>
#include <stdlib.h>

typedef struct list {
    int data;
    struct list *next;
} List;

void insert(List **head, int data) {
    List **indirect = head;
    List *tmp = malloc(sizeof(*tmp));
    tmp->data = data;
    tmp->next = NULL;

    while (*indirect != NULL)
        indirect = &(*indirect)->next;

    *indirect = tmp;
}

void print(List *ptr) {
    while (ptr != NULL) {
        printf("%d ", ptr->data);
        ptr = ptr->next;
    }
    printf("\n");
}

void remove_list_entry(List **head, List *entry) {
    // The "indirect" pointer points to the
    // *address* of the thing we'll update
    List **indirect = head;

    // Walk the list, looking for the thing that
    // points to the entry we want to remove
    while ((*indirect) != entry)
        indirect = &(*indirect)->next;

    // ..and just remove it
    *indirect = entry->next;
}

int main() {
    List *head;
    insert(&head, 1);
    insert(&head, 2);
    insert(&head, 3);
    insert(&head, 4);
    insert(&head, 5);

    List *rm2 = head->next;
    List *rm4 = head->next->next->next;

    print(head);

    remove_list_entry(&head, head);
    print(head);

    remove_list_entry(&head, rm2);
    print(head);

    remove_list_entry(&head, rm4);
    print(head);

    return 0;
}
$ gcc -o goodtaste goodtaste.c && ./goodtaste

1 2 3 4 5
2 3 4 5
3 4 5
3 5

Reference